Martin Winter [Fri, 17 Jan 2020 16:18:19 +0000 (17:18 +0100)]
FRRouting Release 7.2.1
(Maintenance Release)
- BGPd
- Fix Addpath issue
- Do not apply eBGP policy for iBGP peers
- Show `ip` and `fqdn` in json output for `show [ip] bgp <route> json`
- Fix large route-distinguisher's format
- Fix `no bgp listen range ...` configuration command
- Autocomplete neighbor for clear bgp
- Reflect the distance in RIB when it is changed for an arbitrary afi/safi
- Notify "Peer De-configured" after entering 'no neighbor <neighbor>' cmd
- Fix per afi/safi addpath peer counting
- Rework BGP dampening to be per AFI/SAFI
- Do not send next-hop as :: in MP_REACH_NLRI if no link-local exists
- Override peer's TTL only if peer-group is configured with TTL
- Remove error message for unknown afi/safi combination
- Keep the session down if maximum-prefix is reached
- OSPFd
- Fix BFD down not tearing down OSPF adjacency for point-to-point net
- BFDd
- Fix multiple VRF handling
- VRF security improvement
- PIMd
- Fix rp crash
- NHRPd
- Make sure `no ip nhrp map <something>` works as expected
- LDPd
- Add missing sanity check in the parsing of label messages
- Zebra
- Use correct state when installing evpn macs
- Capture dplane plugin flags
- lib
- Fix interface config when vrf changes
- Fix Interface Infinite Loop Walk (for special interfaces such as bond)
- snapcraft
- fix missing vrrpd daemon
- Others
- Rename man pages (to avoid conflicts with other packages)
- Various other fixes for code cleanup and memory leaks
Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
Quentin Young [Sun, 22 Dec 2019 01:19:47 +0000 (20:19 -0500)]
pimd: readd iph length checks
Kernel might not hand us a bad packet, but better safe than sorry here.
Validate the IP header length field. Also adds an additional check that
the packet length is sufficient for an IGMP packet, and a check that we
actually have enough for an ip header at all.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
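The three checks listed in this commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the pimd code: the helper name `checked_ip_hdr_len` and both constants are hypothetical, and the 8-byte IGMP minimum is assumed here for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MIN_IP_HDR_LEN 20 /* IPv4 header without options */
#define MIN_IGMP_LEN 8    /* assumed minimum IGMP message size */

/*
 * Hypothetical helper: returns the IP header length in bytes if the
 * buffer passes all three checks, or -1 otherwise.
 */
static int checked_ip_hdr_len(const uint8_t *buf, size_t len)
{
    /* 1. Do we have enough bytes for an IP header at all? */
    if (len < MIN_IP_HDR_LEN)
        return -1;

    /* 2. Validate the header length field (low nibble, in 32-bit words). */
    size_t ihl = (size_t)(buf[0] & 0x0f) * 4;
    if (ihl < MIN_IP_HDR_LEN || ihl > len)
        return -1;

    /* 3. Is what remains long enough for an IGMP packet? */
    if (len - ihl < MIN_IGMP_LEN)
        return -1;

    return (int)ihl;
}
```

The order matters: the header length field is only read once the buffer is known to contain a full minimal header.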
Quentin Young [Tue, 3 Dec 2019 20:48:27 +0000 (15:48 -0500)]
bgpd: more attribute parsing cleanup & paranoia
* Move VNC interning to the appropriate spot
* Use existing bgp_attr_flush_encap to free encap sets
* Assert that refcounts are correct before exiting to keep the demons
contained in their fiery prison
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Tue, 26 Nov 2019 19:42:40 +0000 (14:42 -0500)]
bgpd: clean up attribute parsing state before ret
Early exits without appropriate cleanup were causing obscure double
frees and other issues later on in the attribute parsing code. If we
return anything except a hard attribute parse error, we have cleanup and
refcounts to manage.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Thu, 21 Nov 2019 23:55:59 +0000 (18:55 -0500)]
bgpd: fix heap buffer overflow in lcom -> str enc
Spaces were not being accounted for in the heap buffer sizing, leading
to a heap buffer overflow when encoding large communities to their
string representations.
This patch also uses safer functions to do the encoding instead of
pointer math.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
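A minimal sketch of the sizing issue, assuming each large community is three 32-bit integers printed as `a:b:c` and separated by single spaces. The function `lcom_to_str` and the constant `LCOM_STR_MAX` are hypothetical names for illustration, not the bgpd implementation.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Longest text form of one value: "4294967295:4294967295:4294967295" */
#define LCOM_STR_MAX 32

/*
 * Hypothetical encoder: vals holds n triples of 32-bit integers.
 * Returns a malloc'd string the caller must free, or NULL on failure.
 */
static char *lcom_to_str(const uint32_t (*vals)[3], size_t n)
{
    /* n values, (n - 1) separating spaces, one terminating NUL. */
    size_t size = n * LCOM_STR_MAX + (n ? n - 1 : 0) + 1;
    char *buf = malloc(size);
    size_t off = 0;

    if (!buf)
        return NULL;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        /* snprintf is told how much room is left, so it cannot overrun. */
        int w = snprintf(buf + off, size - off,
                         "%s%" PRIu32 ":%" PRIu32 ":%" PRIu32,
                         i ? " " : "", vals[i][0], vals[i][1], vals[i][2]);
        if (w < 0 || (size_t)w >= size - off) {
            free(buf);
            return NULL;
        }
        off += (size_t)w;
    }
    return buf;
}
```

Leaving the `(n - 1)` term out of `size` reproduces the class of bug described: the buffer is too small by exactly the number of separators when every value is at its maximum width.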
Mitchell Skiba [Thu, 9 Jan 2020 19:46:13 +0000 (11:46 -0800)]
bgpd: add addpath ID to adj_out tree sort
When withdrawing addpaths, adj_lookup was called to find the path that
needed to be withdrawn. It would lookup in the RB tree based on subgroup
pointer alone, often find the path with the wrong addpath ID, and return
null. Only the path highest in the tree sent to the subgroup could be
found, and thus withdrawn.
Adding the addpath ID to the sort criteria for the RB tree allows us to
simplify the logic for adj_lookup, and address this problem. We are able
to remove the logic around non-addpath subgroups because the addpath ID
is consistently 0 for non-addpath adj_outs, so special logic to skip
matching the addpath ID isn't required. (As a side note, addpath will
also never use ID 0, so there won't be any ambiguity when looking at the
structure content.)
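The sort criteria described above could look roughly like the comparator below. The structure and field names (`adj_out`, `addpath_tx_id`) are simplified stand-ins assumed for illustration; the real bgpd structures carry more state.

```c
#include <stdint.h>

struct subgroup; /* opaque in this sketch */

/* Simplified stand-in for an adj_out entry; field names are assumed. */
struct adj_out {
    struct subgroup *subgroup;
    uint32_t addpath_tx_id; /* 0 for non-addpath entries */
};

/* Three-way comparison: subgroup pointer first, then addpath ID. */
static int adj_out_cmp(const struct adj_out *a, const struct adj_out *b)
{
    uintptr_t sa = (uintptr_t)a->subgroup;
    uintptr_t sb = (uintptr_t)b->subgroup;

    if (sa != sb)
        return sa < sb ? -1 : 1;
    if (a->addpath_tx_id != b->addpath_tx_id)
        return a->addpath_tx_id < b->addpath_tx_id ? -1 : 1;
    return 0;
}
```

Because non-addpath entries always carry ID 0, a lookup keyed on (subgroup, 0) finds them without any special case.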
Quentin Young [Wed, 15 Jan 2020 18:00:34 +0000 (13:00 -0500)]
bgpd: fix memory leak when parsing capabilities
Duplicated domain name capability messages cause a memory leak. The amount
of leaked memory is proportional to the size of the duplicated
capabilities. This bug was introduced in 2015.
To hit this, a BGP OPEN message must contain multiple FQDN capabilities.
Memory is leaked when the hostname portion of the capability is of
length 0, but the domainname portion is not, for any of the duplicated
capabilities beyond the first one.
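One general way to avoid this class of leak is to release any previously stored string before storing the new one. The sketch below shows that pattern only; it is not taken from the actual fix, and `peer_names` and `set_name` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the names learned from a capability. */
struct peer_names {
    char *hostname;
    char *domainname;
};

/*
 * Copy len bytes of src into *slot, freeing whatever an earlier
 * (duplicated) capability may have stored there. Returns 0 on success.
 */
static int set_name(char **slot, const char *src, size_t len)
{
    char *copy = malloc(len + 1);

    if (!copy)
        return -1;
    memcpy(copy, src, len);
    copy[len] = '\0';

    free(*slot); /* free(NULL) is a no-op, so the first call is safe */
    *slot = copy;
    return 0;
}
```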
Quentin Young [Sat, 4 Jan 2020 02:22:44 +0000 (21:22 -0500)]
zebra: reject ingress packets that are too large
There may be logic to prevent this ever happening earlier in the network
read path, but it doesn't hurt to double check it here, because clearly
deeper paths rely on this being the case.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Sat, 4 Jan 2020 02:18:49 +0000 (21:18 -0500)]
zebra: fix multiple bfd buffer issues
Whatever this BFD re-transmission function is had a few problems.
1. Used memcpy instead of the (more concise) stream APIs, which include
bounds checking.
2. Did not sufficiently check packet sizes.
Actually, 2) is mitigated but is still a problem, because the BFD header
is 2 bytes larger than the "normal" ZAPI header, while the overall
message size remains the same. So if the source message being duplicated
is actually right up against the ZAPI_MAX_PACKET_SIZ, you still can't
fit the whole message into your duplicated message. I have no idea what
the intent was here but at least there's a warning if it happens now.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Sat, 4 Jan 2020 03:28:53 +0000 (22:28 -0500)]
zebra: fix iptable memleak, fix free funcs
- Fix iptable freeing code to free malloc'd list
- malloc iptable in zapi handler and use those functions to free it when
done to fix a linked list memleak
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Fri, 3 Jan 2020 07:12:58 +0000 (02:12 -0500)]
zebra: check pbr rule msg for correct afi
Further down we hash the src & dst IP, which asserts that the afi is one
of the well-known ones. Given the field names, I assume the correct AFIs
here are AF_INET[6].
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
This commit addresses issue #5629.
Before this commit, bgpd built the format string for the
BGP route distinguisher as int32, but the correct format
is uint32. The current bgpd show-running CLI generates an int32 RD,
so if the user sets the RD to 1:4294967295 (0x1:0xffffffff),
the show-running CLI emits 1:-1 in the running-config. This
commit fixes that issue.
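The signedness problem can be illustrated with a small formatter. This is a sketch assuming a two-part RD with a 16-bit administrator field and a 32-bit assigned number; `rd_to_str` is a hypothetical name, not the bgpd function.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical formatter for an "admin:assigned" route distinguisher.
 * The assigned number is printed as unsigned 32-bit; printing the same
 * bits with a signed conversion is what turns 4294967295 into -1.
 */
static int rd_to_str(char *buf, size_t size, uint16_t admin,
                     uint32_t assigned)
{
    return snprintf(buf, size, "%" PRIu16 ":%" PRIu32, admin, assigned);
}
```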
Donatas Abraitis [Thu, 19 Dec 2019 20:09:47 +0000 (22:09 +0200)]
bgpd: Make sure we can use `no bgp listen range ...`
Fixes:
```
exit1-debian-9(config-router)# no bgp listen range 192.168.10.0/24 peer-group TEST
% Peer-group does not exist
exit1-debian-9(config-router)#
```
Closes https://github.com/FRRouting/frr/issues/5570
Donald Sharp [Mon, 2 Dec 2019 14:37:47 +0000 (09:37 -0500)]
bgpd: Prevent crash in bgp_table_range_lookup
The function bgp_table_range_lookup attempts to walk down
the table node data structures to find a list of matching
nodes. We need to guard against the current node
not matching and the child nodes not having anything.
Add a bit of code to guard against this.
Traceback that led me down this path:
Nov 24 12:22:38 frr bgpd[20257]: Received signal 11 at 1574616158 (si_addr 0x2, PC 0x46cdc3); aborting...
Nov 24 12:22:38 frr bgpd[20257]: Backtrace for 11 stack frames:
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_backtrace_sigsafe+0x67) [0x7fd1ad445957]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(zlog_signal+0x113) [0x7fd1ad445db3]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(+0x70e65) [0x7fd1ad465e65]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libpthread.so.0(+0xf5f0) [0x7fd1abd605f0]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(bgp_table_range_lookup+0x63) [0x46cdc3]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib64/frr/modules/bgpd_rpki.so(+0x4f0d) [0x7fd1a934ff0d]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(thread_call+0x60) [0x7fd1ad4736e0]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libfrr.so.0(frr_run+0x128) [0x7fd1ad443ab8]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd(main+0x2e3) [0x41c043]
Nov 24 12:22:38 frr bgpd[20257]: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fd1ab9a5505]
Nov 24 12:22:38 frr bgpd[20257]: /usr/lib/frr/bgpd() [0x41d9bb]
Nov 24 12:22:38 frr bgpd[20257]: in thread bgpd_sync_callback scheduled from bgpd/bgp_rpki.c:351#012; aborting...
Nov 24 12:22:38 frr watchfrr[6779]: [EC 268435457] bgpd state -> down : read returned EOF
Nov 24 12:22:38 frr zebra[5952]: [EC 4043309116] Client 'bgp' encountered an error and is shutting down.
Nov 24 12:22:38 frr zebra[5952]: zebra/zebra_ptm.c:1345 failed to find process pid registration
Nov 24 12:22:38 frr zebra[5952]: client 15 disconnected. 0 bgp routes removed from the rib
I am not really 100% sure what we are really trying to do with this function, but we must
guard against child nodes not having any data.
Fixes: #5440
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Sun, 1 Dec 2019 14:29:32 +0000 (09:29 -0500)]
bgpd: Fix memory leak in json output of show commands
When dumping a large amount of table data via bgp_show_table,
if there is no information to display for a particular
`struct bgp_node *`, the data allocated via json_object_new_array()
is leaked. Not a big deal on small tables, but if you have a full
bgp feed and issue a show command that does not match any of
the route nodes (say `vtysh -c "show bgp ipv4 large-community-list FOO"`)
then we will leak memory.
Before code change and issuing the above show bgp large-community-list command 15-20 times:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: > 2GB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: > 2GB
Free small blocks: 31 MiB
Free ordinary blocks: 616 KiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
After:
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 924 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 558 MiB
Free small blocks: 26 MiB
Free ordinary blocks: 340 MiB
Ordinary blocks: 0
Small blocks: 0
Holding blocks: 0
Please note the 340 MiB of free ordinary blocks is from the fact I issued a
`show bgp ipv4 uni json` command and generated a large amount of data.
Fixes: #5445
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donatas Abraitis [Thu, 31 Oct 2019 07:53:18 +0000 (09:53 +0200)]
bgpd: Reflect the distance in RIB when it is changed for an arbitrary afi/safi
debian-9# show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [20/0] via 192.168.0.1, eth1, 00:15:22
debian-9# conf
debian-9(config)# router bgp 100
debian-9(config-router)# address-family ipv4
debian-9(config-router-af)# distance bgp 123 123 123
debian-9(config-router-af)# do show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [123/0] via 192.168.0.1, eth1, 00:00:09
debian-9(config-router-af)# no distance bgp
debian-9(config-router-af)# do show ip route 192.168.255.2/32 longer-prefixes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route, q - queued route, r - rejected route
B>* 192.168.255.2/32 [20/0] via 192.168.0.1, eth1, 00:00:02
debian-9(config-router-af)#
Donald Sharp [Wed, 20 Nov 2019 00:36:19 +0000 (19:36 -0500)]
pimd: Various buffer overflow reads and crashes
A variety of buffer overflow reads and crashes
that could occur if you fed bad info into pim.
1) When the type is set up incorrectly we were printing the first 8 bytes
of the pim_parse_addr_source, but the minimum encoding length is
4 bytes. As such we would read beyond the end of the buffer.
2) The RP(pim, grp) macro can return a NULL value.
Do not automatically assume that we can deref
the data.
3) BSM parsing was not properly sanitizing data input from the wire
and we could enter into situations where we would read beyond
the end of the buffer. Prevent this from happening, since we are
probably left in a bad way.
4) The received bit length cannot be greater than 32 bits;
refuse to allow it to happen.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 19 Nov 2019 13:22:50 +0000 (08:22 -0500)]
pimd: Fix possible read beyond end of data received
If a register packet is received that is less than the PIM_MSG_REGISTER_LEN
in size we can have a possible situation where the data being
checksummed is just random data from the buffer we read into.
2019/11/18 21:45:46 warnings: PIM: int pim_if_add_vif(struct interface *, _Bool, _Bool): could not get address for interface fuzziface ifindex=0
==27636== Invalid read of size 4
==27636== at 0x4E6EB0D: in_cksum (checksum.c:28)
==27636== by 0x4463CC: pim_pim_packet (pim_pim.c:194)
==27636== by 0x40E2B4: main (pim_main.c:117)
==27636== Address 0x771f818 is 0 bytes after a block of size 24 alloc'd
==27636== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27636== by 0x40E261: main (pim_main.c:112)
==27636==
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 19 Nov 2019 20:46:42 +0000 (15:46 -0500)]
zebra: Router Advertisement socket mess up
The code for when a new vrf is created to properly handle
router advertisement for it is messed up in several ways:
1) Generation of the zrouter data structure should set the rtadv
socket to -1 so that we don't accidentally close someone else's
open file descriptor
2) When you created a new zvrf instance *after* bootup we are XCALLOC'ing
the data structure so the zvrf->fd was 0. The shutdown code was looking
for the >= 0 to know if the fd existed (since fd 0 is valid!)
This sequence of events would cause zebra to consume 100% of the
cpu:
Run zebra by itself ( no other programs )
ip link add vrf1 type vrf table 1003
ip link del vrf vrf1
vtysh -c "configure" -c "no interface vrf1"
This commit fixes this issue.
Fixes: #5376
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
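The fd problem described in this commit comes down to zero-filled allocation producing a valid descriptor number. A minimal sketch of the sentinel pattern follows; `vrf_ctx`, `vrf_ctx_new` and `vrf_ctx_free` are hypothetical names, and plain `calloc` stands in for XCALLOC.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-VRF context holding a router advertisement socket. */
struct vrf_ctx {
    int rtadv_sock;
};

static struct vrf_ctx *vrf_ctx_new(void)
{
    struct vrf_ctx *v = calloc(1, sizeof(*v));

    /* calloc leaves the field at 0, which is a valid fd; mark it unused. */
    if (v)
        v->rtadv_sock = -1;
    return v;
}

static void vrf_ctx_free(struct vrf_ctx *v)
{
    if (!v)
        return;
    /* Only close a descriptor this context actually opened. */
    if (v->rtadv_sock >= 0)
        close(v->rtadv_sock);
    free(v);
}
```

Without the explicit `-1`, the `>= 0` test in the teardown path would match a freshly allocated context and close descriptor 0, which belongs to someone else.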
Mitch Skiba [Thu, 14 Nov 2019 19:28:23 +0000 (19:28 +0000)]
bgpd: Fix per afi/safi addpath peer counting
The total_peercount table was created as a short cut for queries about
if addpath was enabled at all on a particular afi/safi. However, the
values weren't updated, so BGP would act as if addpath wasn't enabled
when determining if updates should be sent out. The error in behavior
was much more noticeable in tx-all than best-per-as, since changes in
what is sent by best-per-as would often trigger updates even if addpath
wasn't enabled.
Donald Sharp [Mon, 18 Nov 2019 16:43:52 +0000 (11:43 -0500)]
pimd: Create pimreg interface when we start any interface config
When you enter interface configuration without explicitly
configuring pim on that interface, we were not creating the pimreg
interface and as such we would crash in an attempted register
since the pimreg device is non-existent.
The crash is this:
==8823== Invalid read of size 8
==8823== at 0x468614: pim_channel_add_oif (pim_oil.c:392)
==8823== by 0x46D0F1: pim_register_join (pim_register.c:61)
==8823== by 0x449AB3: pim_mroute_msg_nocache (pim_mroute.c:242)
==8823== by 0x449AB3: pim_mroute_msg (pim_mroute.c:661)
==8823== by 0x449AB3: mroute_read (pim_mroute.c:707)
==8823== by 0x4FC0676: thread_call (thread.c:1549)
==8823== by 0x4EF3A2F: frr_run (libfrr.c:1064)
==8823== by 0x40DCB5: main (pim_main.c:162)
==8823== Address 0xc8 is not stack'd, malloc'd or (recently) free'd
exit2-debian-9# show ip bgp ipv4 unicast dampening parameters
Half-life time: 1 min
Reuse penalty: 2
Suppress penalty: 3
Max suppress time: 4 min
Max suppress penalty: 32
exit2-debian-9# show ip bgp ipv4 multicast dampening parameters
Half-life time: 5 min
Reuse penalty: 6
Suppress penalty: 7
Max suppress time: 8 min
Max suppress penalty: 18
ospf: BFD down not tearing down OSPF adjacency for point-to-point network
Root Cause:
Lookup for the point-to-point neighbor was failing because the neighbor
lookup was based on neighbor interface IP address. But, for point-to-point
neighbor the key is router-id for lookup. Lookup failure was causing the
BFD updates from PTM to get dropped.
Fix:
Added walk of the neighbor list if the network type is point-to-point to
find the appropriate neighbor. The match is based on source IP address of
the neighbor since that’s the address registered with BFD for monitoring.
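The fix described above amounts to a linear walk of the neighbor list that matches on source address instead of the router-id key. The sketch below assumes a simple singly linked list and IPv4 addresses held as 32-bit integers; `nbr` and `nbr_find_by_src` are hypothetical names, not the ospfd data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified neighbor entry; field names are assumed for illustration. */
struct nbr {
    uint32_t router_id; /* lookup key on point-to-point networks */
    uint32_t src_addr;  /* source address registered with BFD */
    struct nbr *next;
};

/* Walk the list and return the neighbor whose source address matches. */
static struct nbr *nbr_find_by_src(struct nbr *head, uint32_t src_addr)
{
    for (struct nbr *n = head; n; n = n->next) {
        if (n->src_addr == src_addr)
            return n;
    }
    return NULL;
}
```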