Olivier Dugeon [Wed, 10 Feb 2021 18:20:24 +0000 (19:20 +0100)]
ospfd: Debug race condition in Segment Routing
Issue #7926 hilight a race condition in Segment Routing processing.
The problem occurs when Router Information Opaque LSA is received late, in
particular after SPF run and after ospf_sr_nhlfe_update() was called. This
scenario is unfrequent and takes place due to a slow DR election.
In this particular case, SR Prefix are handle but not fully fill. In fact,
SRGB for the nexthop is not yet received and thus, output label could not
be computed.
When Router Information Opaque LSA is received and processed, if the
corresponding SR node is a direct neighbor of the self node, update_out_nhlfe()
is called against all SR nodes to adjust SR prefix if the next hop is the new
SR node. The function wrongly computes output label and configure a bad MPLS
LFIB entries.
Another way to hilight the problem is to change through CLI the SRGB of a node
and look to MPLS LFIB of direct neighbor, in particular those who announce
EXPLICIT NULL Prefix SID.
This patch correct the update_out_nhlfe() function by calling the appropriate
function (sr_prefix_out_label() instead of index2label()) to compute the output
label.
Some log debugs were adjusted and unused prefix route table was removed too.
Igor Ryzhov [Tue, 9 Feb 2021 18:38:45 +0000 (21:38 +0300)]
vrf: mark vrf as configured when entering vrf node
The VRF must be marked as configured when user enters "vrf NAME" command.
Otherwise, the following problem occurs:
`ip link add red type vrf table 1`
VRF structure is allocated.
`vtysh -c "conf t" -c "vrf red"`
`lib_vrf_create` is called, and pointer to the VRF structure is stored
to the nb_config_entry.
`ip link del red`
VRF structure is freed (because it is not marked as configured), but
the pointer is still stored in the nb_config_entry.
`vtysh -c "conf t" -c "no vrf red"`
Nothing happens, because VRF structure doesn't exist. It means that
`lib_vrf_destroy` is not called, and nb_config_entry still exists in
the running config with incorrect pointer.
`ip link add red type vrf table 1`
New VRF structure is allocated.
`vtysh -c "conf t" -c "vrf red"`
`lib_vrf_create` is NOT called, because the nb_config_entry for that
VRF name still exists in the running config.
After that all NB commands for this VRF will use incorrect pointer to
the freed VRF structure.
zyxwvu Shi [Mon, 8 Feb 2021 12:09:02 +0000 (20:09 +0800)]
bgpd: Do not compare attr again.
`same_attr` has been computed and `hook_call(bgp_process)` (calling
BMP module) would not change it. We could reuse the value to filter
same attribute updates, avoiding an extra comparison.
Quentin Young [Mon, 8 Feb 2021 01:15:24 +0000 (20:15 -0500)]
vtysh: disable bracketed paste in readline
GNU Readline 8.1 enables bracketed paste by default. This results in
newlines not ending the readline() call, which breaks the ability of
users to paste in configs to vtysh's interactive shell.
Disable bracketed paste.
Signed-off-by: Quentin Young <qlyoung@qlyoung.net>
Donald Sharp [Sun, 7 Feb 2021 20:03:51 +0000 (15:03 -0500)]
bfdd: Prevent use after free ( again )
Valgrind is still reporting:
466020-==466020== by 0x11B9F4: main (bfdd.c:403)
466020-==466020== Address 0x5a7d544 is 84 bytes inside a block of size 272 free'd
466020:==466020== at 0x48399AB: free (vg_replace_malloc.c:538)
466020-==466020== by 0x490A947: qfree (memory.c:140)
466020-==466020== by 0x48F2AE8: if_delete (if.c:322)
466020-==466020== by 0x48F250D: if_destroy_via_zapi (if.c:195)
466020-==466020== by 0x497071E: zclient_interface_delete (zclient.c:2040)
466020-==466020== by 0x49745F6: zclient_read (zclient.c:3687)
466020-==466020== by 0x4955AEC: thread_call (thread.c:1684)
466020-==466020== by 0x48FF64E: frr_run (libfrr.c:1126)
466020-==466020== by 0x11B9F4: main (bfdd.c:403)
466020-==466020== Block was alloc'd at
466020:==466020== at 0x483AB65: calloc (vg_replace_malloc.c:760)
466020-==466020== by 0x490A805: qcalloc (memory.c:115)
466020-==466020== by 0x48F23D6: if_new (if.c:160)
466020-==466020== by 0x48F257F: if_create_name (if.c:214)
466020-==466020== by 0x48F3493: if_get_by_name (if.c:558)
466020-==466020== by 0x49705F2: zclient_interface_add (zclient.c:1989)
466020-==466020== by 0x49745E0: zclient_read (zclient.c:3684)
466020-==466020== by 0x4955AEC: thread_call (thread.c:1684)
466020-==466020== by 0x48FF64E: frr_run (libfrr.c:1126)
466020-==466020== by 0x11B9F4: main (bfdd.c:403)
Apparently the bs->ifp pointer is being set even in cases when
the bs->key.ifname is not being set. So go through and just
match the interface pointer and cut-to-the-chase.
Donald Sharp [Sun, 7 Feb 2021 19:59:53 +0000 (14:59 -0500)]
*: Fix usage of bfd_adj_event
Valgrind reports:
469901-==469901==
469901-==469901== Conditional jump or move depends on uninitialised value(s)
469901:==469901== at 0x3A090D: bgp_bfd_dest_update (bgp_bfd.c:416)
469901-==469901== by 0x497469E: zclient_read (zclient.c:3701)
469901-==469901== by 0x4955AEC: thread_call (thread.c:1684)
469901-==469901== by 0x48FF64E: frr_run (libfrr.c:1126)
469901-==469901== by 0x213AB3: main (bgp_main.c:540)
469901-==469901== Uninitialised value was created by a stack allocation
469901:==469901== at 0x3A0725: bgp_bfd_dest_update (bgp_bfd.c:376)
469901-==469901==
469901-==469901== Conditional jump or move depends on uninitialised value(s)
469901:==469901== at 0x3A093C: bgp_bfd_dest_update (bgp_bfd.c:421)
469901-==469901== by 0x497469E: zclient_read (zclient.c:3701)
469901-==469901== by 0x4955AEC: thread_call (thread.c:1684)
469901-==469901== by 0x48FF64E: frr_run (libfrr.c:1126)
469901-==469901== by 0x213AB3: main (bgp_main.c:540)
469901-==469901== Uninitialised value was created by a stack allocation
469901:==469901== at 0x3A0725: bgp_bfd_dest_update (bgp_bfd.c:376)
On looking at bgp_bfd_dest_update the function call into bfd_get_peer_info
when it fails to lookup the ifindex ifp pointer just returns leaving
the dest and src prefix pointers pointing to whatever was passed in.
Let's do two things:
a) The src pointer was sometimes assumed to be passed in and sometimes not.
Forget that. Make it always be passed in
b) memset the src and dst pointers to be all zeros. Then when we look
at either of the pointers we are not making decisions based upon random
data in the pointers.
bgpd: Dump BGP attrs to check what's the actual prefix with aggr_as 0
Just for more debug information regarding malformed aggregator_as.
```
bgpd[5589]: [EC 33554434] 192.168.10.25: AGGREGATOR AS number is 0 for aspath: 65030
bgpd[5589]: bgp_attr_aggregator: attributes: nexthop 192.168.10.25, origin i, path 65030
```
Mark Stapp [Wed, 3 Feb 2021 21:28:18 +0000 (16:28 -0500)]
tests: fix ospf6_topo1 missing ref files
Only one of the four reference files was present; add the missing
three. The test just silently passed if a ref file was missing:
change that to a failure.
Wesley Coakley [Tue, 2 Feb 2021 23:16:51 +0000 (18:16 -0500)]
doc: cross compilation guide
Wrote a little guide for cross-compiling FRR, gleaned from notes I took
while compiling for a RPi 3B+ on a Gentoo x86_64 system.
Care was taken to keep this documentation as generic as possible so
these steps could be applied to any cross-compile targeting a supported
architecture.
bgpd: Drop aggregator_as attribute if malformed in case of BGP_AS_ZERO
An UPDATE message that contains the AS number of zero in the AS_PATH
or AGGREGATOR attribute MUST be considered as malformed and be
handled by the procedures specified in [RFC7606].
An UPDATE message with a malformed AGGREGATOR attribute SHALL be
handled using the approach of "attribute discard".
Attribute discard: In this approach, the malformed attribute MUST
be discarded and the UPDATE message continues to be processed.
This approach MUST NOT be used except in the case of an attribute
that has no effect on route selection or installation.
Pat Ruddy [Mon, 1 Jun 2020 13:33:30 +0000 (14:33 +0100)]
zebra: resolve multiple functions for local MAC delete
the old VXLAN function for local MAC deletion was still in
existence and being called from the VXLAN code whilst the new
generic function was not being called at all. Resolve this so
the generic function matches the old function and is called
exclusively.
David Lamparter [Tue, 2 Feb 2021 20:05:50 +0000 (21:05 +0100)]
lib/xref: fix frrtrace() calls in thread code
This didn't exist yet when the xref code came around, and since
frrtrace() gets collapsed to nothing by the preprocessor when
tracepoints are disabled, it didn't cause any compiler errors...
Signed-off-by: David Lamparter <equinox@diac24.net>
David Lamparter [Tue, 2 Feb 2021 18:38:38 +0000 (19:38 +0100)]
lib/xref: work around GCC bug 41091
gcc fucks up global variables with section attributes when they're used
in templated C++ code. The template instantiation "magic" kinda breaks
down (it's implemented through COMDAT in the linker, which clashes with
the section attribute.)
The workaround provides full runtime functionality, but the xref
extraction tool (xrelfo.py) won't work on C++ code compiled by GCC.
FWIW, clang gets this right.
Signed-off-by: David Lamparter <equinox@diac24.net>
bgpd: config connect timer is not applied immediately for peers in non-established state.
Description:
When user is config connect timer, it doesn't reflect
immediately. It reflect when next time neighbor is tried to reconnect.
Problem Description/Summary :
When user is config connect timer, it doesn't reflect
The network connection was aborted by the local system.d to reconnect.
Fix is to update the connect timer immediately if BGP
session is not in establish state.
Expected Behavior :
If neighbor is not yet established, we should immediately apply the config connect timer to the peer.
Pat Ruddy [Tue, 3 Nov 2020 17:05:24 +0000 (17:05 +0000)]
bgpd: update traps to RFC4273 Notifications
The old bgpTraps group was obsolteted by RFC4273 and the
bgpNotifications groups was introduces. The new notifications
mirror the bgpTraps except that an extra item peerRemoteAddr
is sent in the notification. This upgrades the support to
conform with RFC4273
Pat Ruddy [Wed, 21 Oct 2020 17:22:06 +0000 (18:22 +0100)]
lib: allow traps with differently indexed objects
The function smux_trap only allows the paaasin of one index which is
applied to all indexed objects. However there is a requirement for
differently indexed objects within a singe trap. This commit
introduces a new function smux_trap_multi_index which can be called
with an array of indices. If this array is onf length 1 the original
smux_trap behaviour is maintained. smux_trap now calls the new
function with and index array length of 1 to avoid changes to
existing callers.
Pat Ruddy [Fri, 18 Sep 2020 09:12:08 +0000 (10:12 +0100)]
bgpd: add utility to check if a vrf is active
From RFC4382:
A VRF is
up(1) when there is at least one interface associated
with the VRF whose ifOperStatus is up(1). A VRF is
down(2) when:
a. There does not exist at least one interface whose
ifOperStatus is up(1).
b. There are no interfaces associated with the VRF.
Run through interfaces associated with a vrf and return
true if there is one in the up state.
lynne [Sat, 30 Jan 2021 00:36:22 +0000 (19:36 -0500)]
isisd: When adjacencies go up and down add support to modify attached-bit
When adjacencies change state the attached-bits in LSPs in other areas
on the router may need to be modified.
1. If a router no longer has a L2 adjacency to another area the
attached-bit must no longer be sent in the LSP
2. If a new L2 adjacency comes up in a different area then the
attached-bit should be sent in the LSP
Stephen Worley [Mon, 11 Jan 2021 22:30:21 +0000 (17:30 -0500)]
zebra: move pbr hash create after update release
Move the pbr hash creation to be after the update release
and dplane install. Now that rules are installed in a separate
dplane pthread, we can have scenarios where we have an interface
flapping and we install/remove rules sufficiently fast enough we
could issue what we think is an update for an identical rule and
end up releasing the rule right after we created it and sent it to
the dplane. This solves the problem of recving duplicate rules
during interface flapping.
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Stephen Worley [Thu, 7 Jan 2021 20:28:28 +0000 (15:28 -0500)]
pbrd: nht only handle if updates if IPV*_IFINDEX nh
Only handle an interface update in the nexthop tracking code
if the nexthop in question was set with an interface to point
out of. If the nexthop is GW only, the interface update could
be unrelated but have overlapping address space. Let that be
handled elsewhere.
Ex)
```
5.5.5.0/30 dev dummyDoof proto kernel scope link src 5.5.5.1
5.5.5.0/24 dev goofDummy proto kernel scope link src 5.5.5.1
[root@alfred frr-2]# ip ro show table 10000
default via 5.5.5.2 dev dummyDoof proto pbr metric 20
[root@alfred frr-2]# ip link set goofDummy down
[root@alfred frr-2]# ip ro show table 10000
[root@alfred frr-2]# ip link set goofDummy up
[root@alfred frr-2]# ip ro show table 10000
```
Signed-off-by: Stephen Worley <sworley@nvidia.com>
Stephen Worley [Wed, 27 Jan 2021 21:20:22 +0000 (16:20 -0500)]
zebra: disallow resolution to duplicate nexthops
Disallow the resolution to nexthops that are marked duplicate.
When we are resolving to an ecmp group, it's possible this
group has duplicates.
I found this when I hit a bug where we can have groups resolving
to each other and cause the resolved->next->next pointer to increase
exponentially. Sufficiently large ecmp and zebra will grind to a hault.
Like so:
```
D> 4.4.4.14/32 [150/0] via 1.1.1.1 (recursive), weight 1, 00:00:02
* via 1.1.1.1, dummy1 onlink, weight 1, 00:00:02
via 4.4.4.1 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.2 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.3 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.4 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.5 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.6 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.7 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.8 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.9 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.10 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.11 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.12 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.13 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.15 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1 onlink, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1 onlink, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 4.4.4.16 (recursive), weight 1, 00:00:02
via 1.1.1.1, dummy1 onlink, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
via 1.1.1.1, dummy1, weight 1, 00:00:02
D> 4.4.4.15/32 [150/0] via 1.1.1.1 (recursive), weight 1, 00:00:09
* via 1.1.1.1, dummy1 onlink, weight 1, 00:00:09
via 4.4.4.1 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.2 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.3 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.4 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.5 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.6 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.7 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.8 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.9 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.10 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.11 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.12 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.13 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.14 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 4.4.4.16 (recursive), weight 1, 00:00:09
via 1.1.1.1, dummy1 onlink, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
via 1.1.1.1, dummy1, weight 1, 00:00:09
D> 4.4.4.16/32 [150/0] via 1.1.1.1 (recursive), weight 1, 00:00:19
* via 1.1.1.1, dummy1 onlink, weight 1, 00:00:19
via 4.4.4.1 (recursive), weight 1, 00:00:19
via 1.1.1.1, dummy1, weight 1, 00:00:19
via 4.4.4.2 (recursive), weight 1, 00:00:19
Then use sharpd to install 4.4.4.16 -> 4.4.4.1 pointing to that nexthop
group in decending order.
```
With these changes it prevents the growing ecmp above by disallowing
duplicates to be in the resolution decision. These nexthops are not
installed anyways so why should we be resolving to them?
Signed-off-by: Stephen Worley <sworley@nvidia.com>
David Lamparter [Mon, 1 Feb 2021 16:50:01 +0000 (17:50 +0100)]
lib/printf: disable `%n` specifier
We don't use `%n` anywhere, so the only purpose it serves is enabling
exploits.
(I thought about this initially when adding printfrr, but I wasn't sure
we don't use `%n` anywhere, and thought I'll check later, and then just
forgot it...)
Signed-off-by: David Lamparter <equinox@diac24.net>
sudhanshukumar22 [Thu, 12 Nov 2020 12:37:30 +0000 (04:37 -0800)]
zebra: treat vrf add for existing vrf as update
Description: When we get a new vrf add and vrf with same name, but different vrf-id already
exists in the database, we should treat vrf add as update.
This happens mostly when there are lots of vrf and other configuration being replayed.
There may be a stale vrf delete followed by new vrf add. This
can cause timing race condition where vrf delete could be missed and
further same vrf add would get rejected instead of treating last arrived
vrf add as update.
Treat vrf add for existing vrf as update.
Implicitly disable this VRF to cleanup routes and other functions as part of vrf disable.
Update vrf_id for the vrf and update vrf_id tree.
Re-enable VRF so that all routes are freshly installed.
Above 3 steps are mandatory since it can happen that with config reload
stale routes which are installed in vrf-1 table might contain routes from
older vrf-0 table which might have got deleted due to missing vrf-0 in new configuration.