Donald Sharp [Mon, 11 Dec 2023 15:46:53 +0000 (10:46 -0500)]
bgpd: Make `suppress-fib-pending` clear peering
When a peer has come up and already started installing
routes into the rib and `suppress-fib-pending` is either
turned on or off. BGP is left with some routes that
may need to be withdrawn from peers and routes that
it does not know the status of. Clear the BGP peers
for the interesting parties and let's let us come
up to speed as needed.
Signed-off-by: Donald Sharp <sharpd@nvidia.com> Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
bgpd
Check mandatory attributes more carefully for the UPDATE message
Do not suppress conditional advertisement updates if triggered
Fix crash in SNMP BGP4V2-MIB bgpv2PeerErrorsTable()
Handle MP_UNREACH_NLRI malformed packets with session reset
Ignore handling NLRIs if we received the MP_UNREACH_NLRI attribute
Initialise timebuf arrays to zeros for dampening reuse timer
Initialise buffer in bgp_notify_admin_message() before using it
Make sure dampening is enabled for the specified AFI/SAFI
Use proper AFI when dumping information for dampening stuff
Treat EOR as withdrawn to avoid unwanted handling of malformed attrs
eigrpd
Use the correct memory pool on interface deletion
vtysh
Fix show route map JSON output
ospfd
Fix infinite loop when listing OSPF interfaces
pbrd
Fix show pbr map detail json output
zebra
Add encap type when building packet for FPM
Display ptmStatus order in interface JSON
Fix connected route deletion when multiple entry exists
Fix FPM multipath encap addition
Fix link update for veth interfaces
Fix zebra crash when replacing nhe during shutdown
Prevent null pointer dereference
Donatas Abraitis [Sun, 29 Oct 2023 20:44:45 +0000 (22:44 +0200)]
bgpd: Ignore handling NLRIs if we received MP_UNREACH_NLRI
If we receive MP_UNREACH_NLRI, we should stop handling remaining NLRIs if
no mandatory path attributes received.
In other words, if MP_UNREACH_NLRI received, the remaining NLRIs should be handled
as a new data, but without mandatory attributes, it's a malformed packet.
In normal case, this MUST not happen at all, but to avoid crashing bgpd, we MUST
handle that.
Donatas Abraitis [Fri, 27 Oct 2023 08:56:45 +0000 (11:56 +0300)]
bgpd: Treat EOR as withdrawn to avoid unwanted handling of malformed attrs
Treat-as-withdraw, otherwise if we just ignore it, we will pass it to be
processed as a normal UPDATE without mandatory attributes, that could lead
to harmful behavior. In this case, a crash for route-maps with the configuration
such as:
Donald Sharp [Fri, 17 Nov 2023 21:57:20 +0000 (16:57 -0500)]
zebra: Fix fpm multipath encap addition
The fpm code path in building a ecmp route for evpn has
a bug that caused it to not add the encap attribute to
the netlink message. See #f0f7b285b99dbd971400d33feea007232c0bd4a9
for the single path case being fixed.
Donald Sharp [Sat, 28 Oct 2023 14:03:39 +0000 (10:03 -0400)]
zebra: Add encap type when building packet for FPM
Currently in the single nexthop case w/ evpn sending
down via the FPM the encap type is not being set
for the nexthop.
This looks like the result of some code reorg for the
nexthop happened but the fpm failed to be accounted for.
Let's just move the encap type encoding to where it
will happen.
'detail' and 'josn' keyword is given as an optional parameter
for cli arguments. Hence 'detail' keyword was consider as a
pbr 'name' for "show pbr map detail json" command.
Donatas Abraitis [Mon, 23 Oct 2023 20:34:10 +0000 (23:34 +0300)]
bgpd: Check mandatory attributes more carefully for UPDATE message
If we send a crafted BGP UPDATE message without mandatory attributes, we do
not check if the length of the path attributes is zero or not. We only check
if attr->flag is at least set or not. Imagine we send only unknown transit
attribute, then attr->flag is always 0. Also, this is true only if graceful-restart
capability is received.
Thread 1 "bgpd" received signal SIGSEGV, Segmentation fault.
0x00005555556e37be in route_set_aspath_prepend (rule=0x555555aac0d0, prefix=0x7fffffffe050,
object=0x7fffffffdb00) at bgpd/bgp_routemap.c:2282
2282 if (path->attr->aspath->refcnt)
(gdb)
```
Donatas Abraitis [Fri, 20 Oct 2023 08:59:59 +0000 (11:59 +0300)]
bgpd: Do not suppress conditional advertisement updates if triggered
If we have a prefix-list with one entry, and after some time we append a prefix-list
with some more additional entries, conditional advertisement is triggered, and the
old entries are suppressed (because they look identical as sent before).
Hence, the old entries are sent as withdrawals and only new entries sent as updates.
Force re-sending all BGP updates for conditional advertisement. The same is done
for route-refresh, and/or soft clear operations.
zebra: Fix connected route deletion when multiple entry exists
When multiple interfaces have addresses in the same network, deleting
one of them may cause the wrong connected route being deleted.
For example:
ip link add veth1 type veth peer veth2
ip link set veth1 up
ip link set veth2 up
ip addr add dev veth1 192.168.0.1/24
ip addr add dev veth2 192.168.0.2/24
ip addr flush dev veth1
Zebra deletes the route of interface veth2 rather than veth1.
Should match nexthop against ere->re_nhe instead of ere->re->nhe.
ospfd: Fixing infinite loop when listing OSPF interfaces
The problem was happening because the ospf->oiflist has this behaviour, each interface was removed and added at the end of the list in each ospf_network_run_subnet call, generation an infinite loop.
As a solution, a copy of the list was generated and we interacted with a fixed list.
following crash occurs:
at ./nptl/pthread_kill.c:44
at ./nptl/pthread_kill.c:78
at ./nptl/pthread_kill.c:89
context=0x7ffd06d3d300)
at /build/make-pkg/output/_packages/cp-routing/src/lib/sigevent.c:246
length=0x7ffd06d3da88, exact=1, var_len=0x7ffd06d3da90, write_method=<optimized out>)
at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_snmp_bgp4v2.c:364
vp=vp@entry=0x7f7c88b584c0 <bgpv2_variables>, vp_len=vp_len@entry=102,
ename=ename@entry=0x7f7c88b58440 <bgpv2_trap_oid>, enamelen=enamelen@entry=8,
name=name@entry=0x7f7c88b58480 <bgpv2_oid>, namelen=namelen@entry=7,
iname=0x7ffd06d3e7b0, index_len=1, trapobj=0x7f7c88b53b80 <bgpv2TrapBackListv6>,
trapobjlen=6, sptrap=2 '\002')
at /build/make-pkg/output/_packages/cp-routing/src/lib/agentx.c:382
vp_len=vp_len@entry=102, ename=ename@entry=0x7f7c88b58440 <bgpv2_trap_oid>,
enamelen=enamelen@entry=8, name=name@entry=0x7f7c88b58480 <bgpv2_oid>,
namelen=namelen@entry=7, iname=0x7ffd06d3ec30, inamelen=16,
trapobj=0x7f7c88b53b80 <bgpv2TrapBackListv6>, trapobjlen=6, sptrap=2 '\002')
at /build/make-pkg/output/_packages/cp-routing/src/lib/agentx.c:298
at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_snmp_bgp4v2.c:1496
at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_fsm.c:48
at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_fsm.c:1314
event=Receive_NOTIFICATION_message)
at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_fsm.c:2665
at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_packet.c:3129
at /build/make-pkg/output/_packages/cp-routing/src/lib/event.c:1979
at /build/make-pkg/output/_packages/cp-routing/src/lib/libfrr.c:1213
at /build/make-pkg/output/_packages/cp-routing/src/bgpd/bgp_main.c:510
it's due to function bgpv2PeerErrorsTable returning
return SNMP_STRING(msg_str);
with msg_str NULL rather the string ""
bgpd: Initialise timebuf arrays to zeros for dampening reuse timer
Avoid having something like this in outputs:
Before:
```
munet> r1 shi vtysh -c 'show bgp dampening damp'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
munet> r1 shi vtysh -c 'show bgp dampening flap'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
```
munet> r1 shi vtysh -c 'show bgp dampening damp '
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
munet> r1 shi vtysh -c 'show bgp dampening flap'
BGP table version is 10, local router ID is 10.10.10.1, vrf id 0
Default local pref 100, local AS 65001
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Jonas Gorski [Thu, 14 Sep 2023 15:04:16 +0000 (17:04 +0200)]
tools: make --quiet actually suppress output
When calling daemon_stop() with --quiet and e.g. the pidfile is empty,
it won't return early since while "$fail" is set, "$2" is "--quiet", so
the if condition isn't met and it will continue executing, resulting
in error messages in the log:
> Sep 14 14:48:33 localhost watchfrr[2085]: [YFT0P-5Q5YX] Forked background command [pid 2086]: /usr/lib/frr/watchfrr.sh restart all
> Sep 14 14:48:33 localhost frrinit.sh[2075]: /usr/lib/frr/frrcommon.sh: line 216: kill: `': not a pid or valid job spec
> Sep 14 14:48:33 localhost frrinit.sh[2075]: /usr/lib/frr/frrcommon.sh: line 216: kill: `': not a pid or valid job spec
> Sep 14 14:48:33 localhost frrinit.sh[2075]: /usr/lib/frr/frrcommon.sh: line 216: kill: `': not a pid or valid job spec
Fix this by moving the --quiet check into the block to log_failure_msg(),
and also add the check to all other invocations of log_*_msg() to make
--quiet properly suppress output.
Fixes: 19a99d89f088 ("tools: suppress unuseful warnings during restarting frr") Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
(cherry picked from commit 312d5ee1592f8c5b616d330233d1de2643f759e2)
Donald Sharp [Wed, 6 Sep 2023 12:39:02 +0000 (08:39 -0400)]
zebra: Prevent Null pointer deref
If the kernel sends us bad data then the kind_str
will be NULL and a later strcmp operation will
cause a crash.
As a note: If the kernel is not sending us properly
formated netlink messages then we got bigger problems
than zebra crashing. But at least let's prevent zebra
from crashing.
Reported-by: Iggy Frankovic <iggyfran@amazon.com> Signed-off-by: Donald Sharp <sharpd@nvidia.com>
(cherry picked from commit 2b9373c114dfc0154f6291474789f44256358518)
Rajasekar Raja [Thu, 17 Aug 2023 07:47:05 +0000 (00:47 -0700)]
zebra: Fix zebra crash when replacing NHE during shutdown
During replace of a NHE from upper proto in zebra_nhg_proto_add(),
- rib_handle_nhg_replace() is invoked with old NHE where we walk all
RNs/REs & replace the re->nhe whose address points to old NHE.
- In this walk, if prev re->nhe refcnt is decremented to 0, we free up
the memory which the old NHE is pointing to.
Later in zebra_nhg_proto_add(), we end up accessing this freed memory
and crash.
Backtrace:
0Â 0x00007f833f5f48eb in raise () from /lib/x86_64-linux-gnu/libc.so.6
1Â 0x00007f833f5df535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
2Â 0x00007f833f636648 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
3Â 0x00007f833f63cd6a in ?? () from /lib/x86_64-linux-gnu/libc.so.6
4Â 0x00007f833f63cfb4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
5Â 0x00007f833f63fbc8 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
6Â 0x00007f833f64172a in malloc () from /lib/x86_64-linux-gnu/libc.so.6
7Â 0x00007f833f6c3fd2 in backtrace_symbols () from /lib/x86_64-linux-gnu/libc.so.6
8Â 0x00007f833f9013fc in zlog_backtrace_sigsafe (priority=priority@entry=2, program_counter=program_counter@entry=0x7f833f5f48eb <raise+267>) at lib/log.c:222
9Â 0x00007f833f901593 in zlog_signal (signo=signo@entry=6, action=action@entry=0x7f833f988ee8 "aborting...", siginfo_v=siginfo_v@entry=0x7ffee1ce4a30,
  program_counter=program_counter@entry=0x7f833f5f48eb <raise+267>) at lib/log.c:154
10 0x00007f833f92dbd1 in core_handler (signo=6, siginfo=0x7ffee1ce4a30, context=<optimized out>) at lib/sigevent.c:254
11 <signal handler called>
12 0x00007f833f5f48eb in raise () from /lib/x86_64-linux-gnu/libc.so.6
13 0x00007f833f5df535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
14 0x00007f833f958f96 in _zlog_assert_failed (xref=xref@entry=0x7f833f9e4080 <_xref.10705>, extra=extra@entry=0x0) at lib/zlog.c:680
15 0x00007f833f905400 in mt_count_free (mt=0x7f833fa02800 <MTYPE_NH_LABEL>, ptr=0x51) at lib/memory.c:84
16 mt_count_free (ptr=0x51, mt=0x7f833fa02800 <MTYPE_NH_LABEL>) at lib/memory.c:80
17 qfree (mt=0x7f833fa02800 <MTYPE_NH_LABEL>, ptr=0x51) at lib/memory.c:140
18 0x00007f833f90799c in nexthop_del_labels (nexthop=nexthop@entry=0x56091d776640) at lib/nexthop.c:563
19 0x00007f833f907b91 in nexthop_free (nexthop=0x56091d776640) at lib/nexthop.c:393
20 0x00007f833f907be8 in nexthops_free (nexthop=<optimized out>) at lib/nexthop.c:408
21 0x000056091c21aa76 in zebra_nhg_free_members (nhe=0x56091d890840) at zebra/zebra_nhg.c:1628
22 zebra_nhg_free (nhe=0x56091d890840) at zebra/zebra_nhg.c:1628
23 0x000056091c21bab2 in zebra_nhg_proto_add (id=<optimized out>, type=9, instance=<optimized out>, session=0, nhg=nhg@entry=0x56091d7da028, afi=afi@entry=AFI_UNSPEC)
  at zebra/zebra_nhg.c:3532
24 0x000056091c22bc4e in process_subq_nhg (lnode=0x56091d88c540) at zebra/zebra_rib.c:2689
25 process_subq (qindex=META_QUEUE_NHG, subq=0x56091d24cea0) at zebra/zebra_rib.c:3290
26 meta_queue_process (dummy=<optimized out>, data=0x56091d24d4c0) at zebra/zebra_rib.c:3343
27 0x00007f833f9492c8 in work_queue_run (thread=0x7ffee1ce55a0) at lib/workqueue.c:285
28 0x00007f833f93f60d in thread_call (thread=thread@entry=0x7ffee1ce55a0) at lib/thread.c:2008
29 0x00007f833f8f9888 in frr_run (master=0x56091d068660) at lib/libfrr.c:1223
30 0x000056091c1b8366 in main (argc=12, argv=0x7ffee1ce5988) at zebra/main.c:551
bgpd
Add peers back to peer hash when peer_xfer_conn fails
Do not explicitly print maxttl value for ebgp-multihop vty output
Do not process nlris if the attribute length is zero
Do not try to redistribute routes if we are shutting down
Don't read the first byte of orf header if we are ahead of stream
Evpn code was not properly unlocking rd_dest
Fix `show bgp all rpki notfound`
Fix session reset issue caused by malformed core attributes
Free bgp vpn policy
Free previously dup'ed aspath attribute for aggregate routes
Free temporary memory after using argv_concat()
Intern attributes before putting into rib-out
Make sure we have enough data to read two bytes when validating aigp
Prevent use after free
Rfapi memleak fixes, clean ce tables at exit
Unlock dest if we return earlier for aggregate install
Use treat-as-withdraw for tunnel encapsulation attribute
lib
Allow unsetting walltime-warning and cpu-warning
Skip route-map optimization if !af_inet(6)
Use max_bitlen instead of magic number
ospf6d
Fix crash because neighbor structure was freed
Stop crash in ospf6_write
ospfd
Check for nulls in vty code
Prevent use after free( and crash of ospf ) when no router ospf
pbrd
Fix crash with match command
pimd
Prevent crash when receiving register message when the rp() is unknown
When receiving a packet be more careful with length in pim_pim_packet
ripd, ripngd
Revert "Cleanup memory allocations on shutdown"
tools
Add what frr thinks as the fib routes for support_bundle
vtysh
Print uniq lines when parsing `no service ...`
zebra
Abstract `dplane_ctx_route_init` to init route without copying
Fix crash when `dplane_fpm_nl` fails to process received routes
Further handle route replace semantics
Fix command ipv6 nht xxx
Fix evpn nexthop config order
Donald Sharp [Wed, 30 Aug 2023 11:25:06 +0000 (07:25 -0400)]
bgpd: Add peers back to peer hash when peer_xfer_conn fails
It was noticed that occassionally peering failed in a testbed
upon investigation it was found that the peer was not in the
peer hash and we saw these failure messages:
Aug 25 21:31:15 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: %NOTIFICATION: sent to neighbor 2001:cafe:1ead:4::4 4/0 (Hold Timer Expired) 0 bytes
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] Can't get remote address and port: Transport endpoint is not connected
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 100663299] %bgp_getsockname() failed for peer 2001:cafe:1ead:4::4 fd 27 (from_peer fd -1)
Aug 25 21:31:22 doca-hbn-service-bf3-s06-1-ipmi bgpd[3048]: [EC 33554464] %Neighbor failed in xfer_conn
Upon looking at the code the peer_xfer_conn function can fail
and the bgp_establish code will then return before adding the
peer back to the peerhash.
This is only part of the failure. The peer also appears to
be in a state where it is no longer initiating connection attempts
but that will be another commited fix when we figure that one out.
The command "show bgp all rpki notfound" includes not only RPKI
notfound routes but also RPKI valid and invalid routes in its results.
Fix the code to display only RPKI notfound routes.
Old output:
```
frr# show bgp all rpki notfound
For address family: IPv4 Unicast
BGP table version is 0, local router ID is 10.0.0.1, vrf id 0
Default local pref 100, local AS 64512
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
N x.x.x.0/18 a.a.a.a 100 0 64513 i
V y.y.y.0/19 a.a.a.a 200 0 64513 i
I z.z.z.0/16 a.a.a.a 10 0 64513 i
Displayed 3 routes and 3 total paths
```
New output:
```
frr# show bgp all rpki notfound
For address family: IPv4 Unicast
BGP table version is 0, local router ID is 10.0.0.1, vrf id 0
Default local pref 100, local AS 64512
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
N x.x.x.0/18 a.a.a.a 100 0 64513 i
Donald Sharp [Wed, 30 Aug 2023 14:33:29 +0000 (10:33 -0400)]
ospfd: Prevent use after free( and crash of ospf ) when no router ospf
Consider this config:
router ospf
redistribute kernel
Then you issue:
no router ospf
ospf will crash with a use after free.
The problem is that the event's associated with the
ospf pointer were shut off then the ospf_external_delete
was called which rescheduled the event. Let's just move
event deletion to the end of the no router ospf.
Donatas Abraitis [Sun, 20 Aug 2023 21:01:42 +0000 (00:01 +0300)]
bgpd: Do not explicitly print MAXTTL value for ebgp-multihop vty output
1. Create /etc/frr/frr.conf
```
frr version 7.5
frr defaults traditional
hostname centos8.localdomain
no ip forwarding
no ipv6 forwarding
service integrated-vtysh-config
line vty
router bgp 4250001000
neighbor 192.168.122.207 remote-as 65512
neighbor 192.168.122.207 ebgp-multihop
```
2. Start FRR
`# systemctl start frr
`
3. Show running configuration. Note that FRR explicitly set and shows the default TTL (225)
```
Building configuration...
Current configuration:
!
frr version 7.5
frr defaults traditional
hostname centos8.localdomain
no ip forwarding
no ipv6 forwarding
service integrated-vtysh-config
!
router bgp 4250001000
neighbor 192.168.122.207 remote-as 65512
neighbor 192.168.122.207 ebgp-multihop 255
!
line vty
!
end
```
4. Copy initial frr.conf to frr.conf.new (no changes)
`# cp /etc/frr/frr.conf /root/frr.conf.new
`
5. Run frr-reload.sh:
```
$ /usr/lib/frr/frr-reload.py --test /root/frr.conf.new
2023-08-20 20:15:48,050 INFO: Called via "Namespace(bindir='/usr/bin', confdir='/etc/frr', daemon='', debug=False, filename='/root/frr.conf.new', input=None, log_level='info', overwrite=False, pathspace=None, reload=False, rundir='/var/run/frr', stdout=False, test=True, vty_socket=None)"
2023-08-20 20:15:48,050 INFO: Loading Config object from file /root/frr.conf.new
2023-08-20 20:15:48,124 INFO: Loading Config object from vtysh show running
Lines To Delete
===============
router bgp 4250001000
no neighbor 192.168.122.207 ebgp-multihop 255
Donatas Abraitis [Fri, 18 Aug 2023 08:28:03 +0000 (11:28 +0300)]
bgpd: Make sure we have enough data to read two bytes when validating AIGP
Found when fuzzing:
```
==3470861==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xffff77801ef7 at pc 0xaaaaba7b3dbc bp 0xffffcff0e760 sp 0xffffcff0df50
READ of size 2 at 0xffff77801ef7 thread T0
0 0xaaaaba7b3db8 in __asan_memcpy (/home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/bgpd/bgpd+0x363db8) (BuildId: cc710a2356e31c7f4e4a17595b54de82145a6e21)
1 0xaaaaba81a8ac in ptr_get_be16 /home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/./lib/stream.h:399:2
2 0xaaaaba819f2c in bgp_attr_aigp_valid /home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/bgpd/bgp_attr.c:504:3
3 0xaaaaba808c20 in bgp_attr_aigp /home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/bgpd/bgp_attr.c:3275:7
4 0xaaaaba7ff4e0 in bgp_attr_parse /home/ubuntu/frr_8_5_2/frr_8_5_2_fuzz_clang/bgpd/bgp_attr.c:3678:10
```
Donatas Abraitis [Fri, 11 Aug 2023 15:21:12 +0000 (18:21 +0300)]
vtysh: Print uniq lines when parsing `no service ...`
Before this patch:
```
no service cputime-warning
no service cputime-warning
no ipv6 forwarding
no service cputime-warning
no service cputime-warning
no service cputime-warning
```
bgpd: Fix session reset issue caused by malformed core attributes
RCA:
On encountering any attribute error for core attributes in update message,
the error handling is set to 'treat as withdraw' and
further parsing of the remaining attributes is skipped.
But the stream pointer is not being correctly adjusted to
point to the next NLRI field skipping the rest of the attributes.
This leads to incorrect parsing of the NLRI field,
which causes BGP session to reset.
Fix:
The stream pointer offset is rightly adjusted to point to the NLRI field correctly
when the malformed attribute is encountered and remaining attribute parsing is skipped.
Trey Aspelund [Fri, 17 Feb 2023 21:47:09 +0000 (21:47 +0000)]
lib: skip route-map optimization if !AF_INET(6)
Currently we unconditionally send a prefix through the optimized
route-map codepath if the v4 and v6 LPM tables have been allocated and
optimization has not been disabled.
However prefixes from address-families that are not IPv4/IPv6 unicast
always fail the optimized route-map index lookup, because they occur on
an LPM tree that is IPv4 or IPv6 specific.
e.g.
Even if you have an empty permit route-map clause, Type-3 EVPN routes
are always denied:
```
--config
route-map soo-foo permit 10
--logs
2023/02/17 19:38:42 BGP: [KZK58-6T4Y6] No best match sequence for pfx: [3]:[0]:[32]:[2.2.2.2] in route-map: soo-foo, result: no match
2023/02/17 19:38:42 BGP: [H5AW4-JFYQC] Route-map: soo-foo, prefix: [3]:[0]:[32]:[2.2.2.2], result: deny
```
There is some existing code that creates an AF_INET/AF_INET6 prefix
using the IP/prefix information from a Type-2/5 EVPN route, which
allowed only these two route-types to successfully attempt an LPM lookup
in the route-map optimization trees via the converted prefix.
This commit does 3 things:
1) Reverts to non-optimized route-map lookup for prefixes that are not
AF_INET or AF_INET6.
2) Cleans up the route-map code so that the AF check is part of the
index lookup + the EVPN RT-2/5 -> AF_INET/6 prefix conversion occurs
outside the index lookup.
3) Adds "debug route-map detail" logs to indicate when we attempt to
convert an AF_EVPN prefix into an AF_INET/6 prefix + when we fallback
to a non-optimized lookup.
Additional functionality for optimized lookups of prefixes from other
address-families can be added prior to the index lookup, similar to how
the existing EVPN conversion works today.
New behavior:
```
2023/02/17 21:44:27 BGP: [WYP1M-NE4SY] Converted EVPN prefix [5]:[0]:[32]:[192.0.2.7] into 192.0.2.7/32 for optimized route-map lookup
2023/02/17 21:44:27 BGP: [MT1SJ-WEJQ1] Best match route-map: soo-foo, sequence: 10 for pfx: 192.0.2.7/32, result: match
2023/02/17 21:44:27 BGP: [H5AW4-JFYQC] Route-map: soo-foo, prefix: 192.0.2.7/32, result: permit
2023/02/17 21:44:27 BGP: [WYP1M-NE4SY] Converted EVPN prefix [2]:[0]:[48]:[aa:bb:cc:00:22:22]:[32]:[20.0.0.2] into 20.0.0.2/32 for optimized route-map lookup
2023/02/17 21:44:27 BGP: [MT1SJ-WEJQ1] Best match route-map: soo-foo, sequence: 10 for pfx: 20.0.0.2/32, result: match
2023/02/17 21:44:27 BGP: [H5AW4-JFYQC] Route-map: soo-foo, prefix: 20.0.0.2/32, result: permit
2023/02/17 21:44:27 BGP: [KHG7H-RH4PN] Unable to convert EVPN prefix [3]:[0]:[32]:[2.2.2.2] into IPv4/IPv6 prefix. Falling back to non-optimized route-map lookup
2023/02/17 21:44:27 BGP: [MT1SJ-WEJQ1] Best match route-map: soo-foo, sequence: 10 for pfx: [3]:[0]:[32]:[2.2.2.2], result: match
2023/02/17 21:44:27 BGP: [H5AW4-JFYQC] Route-map: soo-foo, prefix: [3]:[0]:[32]:[2.2.2.2], result: permit
```
bgpd: Do not try to redistribute routes if we are shutting down
When switching `router bgp`, `no router bgp` and doing redistributions, we should
ignore this action, otherwise memory leak happens:
```
Indirect leak of 400 byte(s) in 2 object(s) allocated from:
0 0x7f81b36b3a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153
1 0x7f81b327bd2e in qcalloc lib/memory.c:105
2 0x55f301d28628 in bgp_node_create bgpd/bgp_table.c:92
3 0x7f81b3309d0b in route_node_new lib/table.c:52
4 0x7f81b3309d0b in route_node_set lib/table.c:61
5 0x7f81b330be0a in route_node_get lib/table.c:319
6 0x55f301ce89df in bgp_redistribute_add bgpd/bgp_route.c:8907
7 0x55f301dac182 in zebra_read_route bgpd/bgp_zebra.c:593
8 0x7f81b334dcd7 in zclient_read lib/zclient.c:4179
9 0x7f81b331d702 in event_call lib/event.c:1995
10 0x7f81b325d597 in frr_run lib/libfrr.c:1213
11 0x55f301b94b12 in main bgpd/bgp_main.c:505
12 0x7f81b2b57082 in __libc_start_main ../csu/libc-start.c:308
```
Donald Sharp [Mon, 17 Jul 2023 14:00:32 +0000 (10:00 -0400)]
zebra: Further handle route replace semantics
When an upper level protocol is installing a route X that needs to be
route replaced and at the same time the same or another protocol installs a
different route that depends on route X for nexthop resolution can leave
us with a state where the route is not accepted because zebra is still
really early in the route replace semantics ( route X is still on the work
Queue to be processed ) then the dependent route would not be installed.
This came up in the bgp_default_originate test cases frequently.
Further extendd the ROUTE_ENTR_ROUTE_REPLACING flag to cover this case
as well. This has come up because the early route processing queueing
that was implemented late last year.
Donald Sharp [Fri, 14 Jul 2023 16:14:20 +0000 (12:14 -0400)]
bgpd: Prevent use after free
When running bgp_always_compare_med, I am frequently seeing a crash
After running with valgrind I am seeing this and a invalid write
immediately after this as well.
==311743== Invalid read of size 2
==311743== at 0x4992421: route_map_counter_decrement (routemap.c:3308)
==311743== by 0x35664D: peer_route_map_unset (bgpd.c:7259)
==311743== by 0x306546: peer_route_map_unset_vty (bgp_vty.c:8037)
==311743== by 0x3066AC: no_neighbor_route_map (bgp_vty.c:8081)
==311743== by 0x49078DE: cmd_execute_command_real (command.c:990)
==311743== by 0x4907A63: cmd_execute_command (command.c:1050)
==311743== by 0x490801F: cmd_execute (command.c:1217)
==311743== by 0x49C5535: vty_command (vty.c:551)
==311743== by 0x49C7459: vty_execute (vty.c:1314)
==311743== by 0x49C97D1: vtysh_read (vty.c:2223)
==311743== by 0x49BE5E2: event_call (event.c:1995)
==311743== by 0x494786C: frr_run (libfrr.c:1204)
==311743== by 0x1F7655: main (bgp_main.c:505)
==311743== Address 0x9ec2180 is 64 bytes inside a block of size 120 free'd
==311743== at 0x484B27F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==311743== by 0x495A1BA: qfree (memory.c:130)
==311743== by 0x498D412: route_map_free_map (routemap.c:748)
==311743== by 0x498D176: route_map_add (routemap.c:672)
==311743== by 0x498D79B: route_map_get (routemap.c:857)
==311743== by 0x499C256: lib_route_map_create (routemap_northbound.c:102)
==311743== by 0x49702D8: nb_callback_create (northbound.c:1234)
==311743== by 0x497107F: nb_callback_configuration (northbound.c:1578)
==311743== by 0x4971693: nb_transaction_process (northbound.c:1709)
==311743== by 0x496FCF4: nb_candidate_commit_apply (northbound.c:1103)
==311743== by 0x496FE4E: nb_candidate_commit (northbound.c:1136)
==311743== by 0x497798F: nb_cli_classic_commit (northbound_cli.c:49)
==311743== by 0x4977B4F: nb_cli_pending_commit_check (northbound_cli.c:88)
==311743== by 0x49078C1: cmd_execute_command_real (command.c:987)
==311743== by 0x4907B44: cmd_execute_command (command.c:1068)
==311743== by 0x490801F: cmd_execute (command.c:1217)
==311743== by 0x49C5535: vty_command (vty.c:551)
==311743== by 0x49C7459: vty_execute (vty.c:1314)
==311743== by 0x49C97D1: vtysh_read (vty.c:2223)
==311743== by 0x49BE5E2: event_call (event.c:1995)
==311743== by 0x494786C: frr_run (libfrr.c:1204)
==311743== by 0x1F7655: main (bgp_main.c:505)
==311743== Block was alloc'd at
==311743== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==311743== by 0x495A068: qcalloc (memory.c:105)
==311743== by 0x498D0C8: route_map_new (routemap.c:646)
==311743== by 0x498D128: route_map_add (routemap.c:658)
==311743== by 0x498D79B: route_map_get (routemap.c:857)
==311743== by 0x499C256: lib_route_map_create (routemap_northbound.c:102)
==311743== by 0x49702D8: nb_callback_create (northbound.c:1234)
==311743== by 0x497107F: nb_callback_configuration (northbound.c:1578)
==311743== by 0x4971693: nb_transaction_process (northbound.c:1709)
==311743== by 0x496FCF4: nb_candidate_commit_apply (northbound.c:1103)
==311743== by 0x496FE4E: nb_candidate_commit (northbound.c:1136)
==311743== by 0x497798F: nb_cli_classic_commit (northbound_cli.c:49)
==311743== by 0x4977B4F: nb_cli_pending_commit_check (northbound_cli.c:88)
==311743== by 0x49078C1: cmd_execute_command_real (command.c:987)
==311743== by 0x4907B44: cmd_execute_command (command.c:1068)
==311743== by 0x490801F: cmd_execute (command.c:1217)
==311743== by 0x49C5535: vty_command (vty.c:551)
==311743== by 0x49C7459: vty_execute (vty.c:1314)
==311743== by 0x49C97D1: vtysh_read (vty.c:2223)
==311743== by 0x49BE5E2: event_call (event.c:1995)
==311743== by 0x494786C: frr_run (libfrr.c:1204)
Effectively the route_map that is being stored has been freed already
but we have not cleaned up properly yet. Go through and clean the
code up by ensuring that the pointer actually exists instead of trusting
it does when doing the decrement operation.
Donatas Abraitis [Mon, 27 Feb 2023 21:40:32 +0000 (23:40 +0200)]
bgpd: Intern attributes before putting into rib-out
After we call subgroup_announce_check(), we leave communities, large-communities
that are modified by route-maps uninterned, and here we have a memory leak.
```
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323:Direct leak of 80 byte(s) in 2 object(s) allocated from:
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #0 0x7f0858d90037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #1 0x7f08589b15b2 in qcalloc lib/memory.c:105
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #2 0x561f5c4e08d2 in lcommunity_new bgpd/bgp_lcommunity.c:28
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #3 0x561f5c4e11d9 in lcommunity_dup bgpd/bgp_lcommunity.c:141
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #4 0x561f5c5c3b8b in route_set_lcommunity bgpd/bgp_routemap.c:2491
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #5 0x7f0858a177a5 in route_map_apply_ext lib/routemap.c:2675
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #6 0x561f5c5696f9 in subgroup_announce_check bgpd/bgp_route.c:2352
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #7 0x561f5c5fb728 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:682
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #8 0x561f5c5fbd95 in subgroup_announce_route bgpd/bgp_updgrp_adv.c:765
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #9 0x561f5c5f6105 in peer_af_announce_route bgpd/bgp_updgrp.c:2187
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #10 0x561f5c5790be in bgp_announce_route_timer_expired bgpd/bgp_route.c:5032
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #11 0x7f0858a76e4e in thread_call lib/thread.c:1991
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #12 0x7f0858974c24 in frr_run lib/libfrr.c:1185
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #13 0x561f5c3e955d in main bgpd/bgp_main.c:505
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #14 0x7f08583a9d09 in __libc_start_main ../csu/libc-start.c:308
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323:Indirect leak of 144 byte(s) in 2 object(s) allocated from:
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #0 0x7f0858d8fe8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #1 0x7f08589b1579 in qmalloc lib/memory.c:100
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #2 0x561f5c4e1282 in lcommunity_dup bgpd/bgp_lcommunity.c:144
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #3 0x561f5c5c3b8b in route_set_lcommunity bgpd/bgp_routemap.c:2491
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #4 0x7f0858a177a5 in route_map_apply_ext lib/routemap.c:2675
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #5 0x561f5c5696f9 in subgroup_announce_check bgpd/bgp_route.c:2352
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #6 0x561f5c5fb728 in subgroup_announce_table bgpd/bgp_updgrp_adv.c:682
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #7 0x561f5c5fbd95 in subgroup_announce_route bgpd/bgp_updgrp_adv.c:765
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #8 0x561f5c5f6105 in peer_af_announce_route bgpd/bgp_updgrp.c:2187
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #9 0x561f5c5790be in bgp_announce_route_timer_expired bgpd/bgp_route.c:5032
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #10 0x7f0858a76e4e in thread_call lib/thread.c:1991
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #11 0x7f0858974c24 in frr_run lib/libfrr.c:1185
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #12 0x561f5c3e955d in main bgpd/bgp_main.c:505
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323- #13 0x7f08583a9d09 in __libc_start_main ../csu/libc-start.c:308
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-
./bgp_large_community.test_bgp_large_community_topo_2/r1.bgpd.asan.2465323-SUMMARY: AddressSanitizer: 224 byte(s) leaked in 4 allocation(s).
```
Direct leak of 80 byte(s) in 2 object(s) allocated from:
0 0x7f8065294d28 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xded28)
1 0x7f8064cfd216 in qcalloc lib/memory.c:105
2 0x5646b7024073 in ecommunity_dup bgpd/bgp_ecommunity.c:252
3 0x5646b7153585 in route_set_ecommunity_lb bgpd/bgp_routemap.c:2925
4 0x7f8064d459be in route_map_apply_ext lib/routemap.c:2675
5 0x5646b7116584 in subgroup_announce_check bgpd/bgp_route.c:2374
6 0x5646b711b907 in subgroup_process_announce_selected bgpd/bgp_route.c:2918
7 0x5646b717ceb8 in group_announce_route_walkcb bgpd/bgp_updgrp_adv.c:184
8 0x7f8064cc6acd in hash_walk lib/hash.c:270
9 0x5646b717ae0c in update_group_af_walk bgpd/bgp_updgrp.c:2046
10 0x5646b7181275 in group_announce_route bgpd/bgp_updgrp_adv.c:1030
11 0x5646b711a986 in bgp_process_main_one bgpd/bgp_route.c:3303
12 0x5646b711b5bf in bgp_process_wq bgpd/bgp_route.c:3444
13 0x7f8064da12d7 in work_queue_run lib/workqueue.c:267
14 0x7f8064d891fb in thread_call lib/thread.c:1991
15 0x7f8064cdffcf in frr_run lib/libfrr.c:1185
16 0x5646b6feca67 in main bgpd/bgp_main.c:505
17 0x7f8063f96c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
Indirect leak of 16 byte(s) in 2 object(s) allocated from:
0 0x7f8065294b40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40)
1 0x7f8064cfcf01 in qmalloc lib/memory.c:100
2 0x5646b7024151 in ecommunity_dup bgpd/bgp_ecommunity.c:256
3 0x5646b7153585 in route_set_ecommunity_lb bgpd/bgp_routemap.c:2925
4 0x7f8064d459be in route_map_apply_ext lib/routemap.c:2675
5 0x5646b7116584 in subgroup_announce_check bgpd/bgp_route.c:2374
6 0x5646b711b907 in subgroup_process_announce_selected bgpd/bgp_route.c:2918
7 0x5646b717ceb8 in group_announce_route_walkcb bgpd/bgp_updgrp_adv.c:184
8 0x7f8064cc6acd in hash_walk lib/hash.c:270
9 0x5646b717ae0c in update_group_af_walk bgpd/bgp_updgrp.c:2046
10 0x5646b7181275 in group_announce_route bgpd/bgp_updgrp_adv.c:1030
11 0x5646b711a986 in bgp_process_main_one bgpd/bgp_route.c:3303
12 0x5646b711b5bf in bgp_process_wq bgpd/bgp_route.c:3444
13 0x7f8064da12d7 in work_queue_run lib/workqueue.c:267
14 0x7f8064d891fb in thread_call lib/thread.c:1991
15 0x7f8064cdffcf in frr_run lib/libfrr.c:1185
16 0x5646b6feca67 in main bgpd/bgp_main.c:505
17 0x7f8063f96c86 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21c86)
```