Stephen Worley [Wed, 8 Apr 2020 23:10:24 +0000 (19:10 -0400)]
zebra: read in and sweep rules on startup
On startup of zebra, read in all ipv4/ipv6 rules from
the kernel and remove any with the zebra proto.
If there are any, this means we failed to remove them
on shutdown, for example because of a crash. Without this,
users have to remove them manually with iproute2 or a
similar tool, which is really annoying.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
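The sweep described above can be sketched as a filter over the rules read
back from the kernel. This is only an illustration, not the zebra code:
`struct rule`, `sweep_rules()` and the `PROTO_ZEBRA` value are assumptions
made for the example, and the real implementation issues netlink deletes
rather than compacting an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a policy rule read back from the kernel. */
struct rule {
	unsigned int priority;
	unsigned char proto; /* who installed the rule */
};

/* Assumed protocol id for rules installed by zebra; illustrative only. */
#define PROTO_ZEBRA 11

/*
 * Compact the array in place, dropping every rule owned by zebra.
 * Returns the number of rules that remain.
 */
static size_t sweep_rules(struct rule *rules, size_t count)
{
	size_t kept = 0;

	assert(rules != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (rules[i].proto == PROTO_ZEBRA)
			continue; /* stale leftover from an unclean shutdown */
		rules[kept++] = rules[i];
	}
	return kept;
}
```

Rules installed by anything else are left alone, which is why the match is
on the protocol field only.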
Stephen Worley [Tue, 7 Apr 2020 20:53:52 +0000 (16:53 -0400)]
pbrd: separate `set *` and `no set *` commands
Separate out the `set *` and `no set *` commands into
different DEFPYs to make the logic of the code easier to
read.
Further, allow non-explicit no commands.
So `no set nexthop`, `no set nexthop-group`, and
`no set vrf` will now work without having to specify
any more data. Before, you had to explicitly match what was
already there.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Wed, 18 Dec 2019 21:11:39 +0000 (16:11 -0500)]
pbrd: implement `set *` and `match *` config replacement
Implement the ability to replace any existing `set *` or
`match *` with another one, or to add more config, without having
to first delete the original config already there.
Before, we needed to constantly execute a `no` command for everything
to remove the rule before making changes to it. With this
patch, you can replace configs on individual sequences much
more easily.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Mon, 6 Apr 2020 18:13:20 +0000 (14:13 -0400)]
pbrd: free nexthop_group name on `no set nexthop-group`
Properly free the string pointed to by `pbrms->nhgrp_name`
when we are removing the config for a nexthop group
on a pbr map sequence.
Found via memleak:
==3152214== 4 bytes in 1 blocks are definitely lost in loss record 308 of 8,814
==3152214== at 0x483980B: malloc (vg_replace_malloc.c:309)
==3152214== by 0x4DC9F7E: strdup (in /usr/lib64/libc-2.30.so)
==3152214== by 0x48E373E: qstrdup (memory.c:122)
==3152214== by 0x408FE7: pbr_map_nexthop_group_magic (pbr_vty.c:264)
==3152214== by 0x408E04: pbr_map_nexthop_group (pbr_vty_clippy.c:347)
==3152214== by 0x48ACF72: cmd_execute_command_real (command.c:1073)
==3152214== by 0x48ACB3B: cmd_execute_command (command.c:1133)
==3152214== by 0x48AD063: cmd_execute (command.c:1288)
==3152214== by 0x493D8EE: vty_command (vty.c:526)
==3152214== by 0x493D397: vty_execute (vty.c:1293)
==3152214== by 0x493C4EC: vtysh_read (vty.c:2126)
==3152214== by 0x49319DC: thread_call (thread.c:1548)
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
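The ownership rule behind this fix can be sketched as follows. This is a
minimal illustration under assumptions: `struct map_sequence`,
`seq_set_nhgrp()` and `seq_unset_nhgrp()` are hypothetical names, and plain
malloc/free stand in for the FRR memory wrappers seen in the trace.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for a pbr map sequence. */
struct map_sequence {
	char *nhgrp_name; /* heap copy of the configured name, or NULL */
};

/* Store a private copy of the name, releasing any previous copy. */
static void seq_set_nhgrp(struct map_sequence *seq, const char *name)
{
	size_t len;
	char *copy;

	assert(seq != NULL && name != NULL);
	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy != NULL)
		memcpy(copy, name, len);
	free(seq->nhgrp_name);
	seq->nhgrp_name = copy;
}

/* Remove the config: free the copy and clear the pointer. */
static void seq_unset_nhgrp(struct map_sequence *seq)
{
	assert(seq != NULL);
	free(seq->nhgrp_name); /* free(NULL) is a no-op */
	seq->nhgrp_name = NULL;
}
```

Clearing the pointer after the free keeps a second `no` command, or a later
set, from touching released memory.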
Stephen Worley [Mon, 6 Apr 2020 16:51:55 +0000 (12:51 -0400)]
pbrd: delete pbr nhg cache after release from hash
Actually delete the allocated pbr_nhg_cache object we just
released.
Found via memory leak:
==3078405== 136 bytes in 1 blocks are definitely lost in loss record 8,282 of 8,802
==3078405== at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==3078405== by 0x48E35E8: qcalloc (memory.c:110)
==3078405== by 0x40EBA7: pbr_nhgc_alloc (pbr_nht.c:194)
==3078405== by 0x48CC0EB: hash_get (hash.c:148)
==3078405== by 0x40F825: pbr_nht_add_individual_nexthop (pbr_nht.c:534)
==3078405== by 0x409853: pbr_map_nexthop_magic (pbr_vty.c:400)
==3078405== by 0x4093F1: pbr_map_nexthop (pbr_vty_clippy.c:417)
==3078405== by 0x48ACF72: cmd_execute_command_real (command.c:1073)
==3078405== by 0x48ACB3B: cmd_execute_command (command.c:1133)
==3078405== by 0x48AD063: cmd_execute (command.c:1288)
==3078405== by 0x493D8EE: vty_command (vty.c:526)
==3078405== by 0x493D397: vty_execute (vty.c:1293)
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Stephen Worley [Thu, 19 Dec 2019 22:11:26 +0000 (17:11 -0500)]
zebra: define some explicit rule replace code paths
Define some explicit rule replace code paths in the dataplane
code and improve the handling around them, including releasing the old
rule from the hash table.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
1. Added 5 test cases to verify BGP AS-allow-in behavior in FRR
2. Enhanced framework to support BGP AS-allow-in config(lib/bgp.py)
3. Added API in bgp.py to verify BGP RIB table(lib/bgp.py)
David Lamparter [Wed, 8 Apr 2020 13:17:21 +0000 (15:17 +0200)]
tests: fix parallel build race
If we're building with a separate build directory, these two build
targets can fail in case their output directory hasn't been created by
some other target that may or may not have run earlier.
Signed-off-by: David Lamparter <equinox@diac24.net>
tests: Adding new test suite bgp_communities_topo1
1. Added 1 test case to verify NO-ADVERTISE Community functionality
2. Enhanced bgp.py to exclude routers from verification if they don't have bgp config
Mark Stapp [Thu, 26 Mar 2020 18:11:56 +0000 (14:11 -0400)]
lib: support replacement in the nexthop-group cli
Use more limited matching logic so that nexthops within a
nexthop-group are unique based only on vrf, type, and gateway.
Treat configuration of a nexthop that matches an existing
nexthop as a replace operation.
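A minimal sketch of that replace-on-match behaviour, assuming a simplified
nexthop with an IPv4 gateway. `struct nh`, `struct nh_group`,
`nh_same_key()` and `nh_group_add()` are hypothetical names for the example
and are not the lib/ API; `weight` stands for any field outside the match key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NEXTHOPS 8

/* Hypothetical simplified nexthop. */
struct nh {
	uint32_t vrf_id;
	int type;
	uint32_t gateway; /* IPv4 address, host byte order */
	uint8_t weight;   /* not part of the match key */
};

struct nh_group {
	struct nh entries[MAX_NEXTHOPS];
	size_t count;
};

/* Uniqueness is decided by vrf, type and gateway only. */
static bool nh_same_key(const struct nh *a, const struct nh *b)
{
	return a->vrf_id == b->vrf_id && a->type == b->type
	       && a->gateway == b->gateway;
}

/* Returns 1 if an entry was replaced, 0 if appended, -1 if full. */
static int nh_group_add(struct nh_group *grp, const struct nh *nh)
{
	assert(grp != NULL && nh != NULL);
	for (size_t i = 0; i < grp->count; i++) {
		if (nh_same_key(&grp->entries[i], nh)) {
			grp->entries[i] = *nh; /* replace in place */
			return 1;
		}
	}
	if (grp->count == MAX_NEXTHOPS)
		return -1;
	grp->entries[grp->count++] = *nh;
	return 0;
}
```

Configuring the same vrf, type and gateway with a different weight updates
the existing entry instead of adding a second one.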
zebra should only check whether a get_chunk operation succeeded
when processing the response, rather than inside the get_chunk
call itself. Splitting the request and response hooks was done
precisely to allow for asynchronous calls to an external label
manager; in this case, the requested chunk is not necessarily
going to be available at request time.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
Yang constraints enforced by the northbound callbacks require that
the maximum lifetime be >= (refresh interval + 300). When we are
moving from one config to another through frr-reload.py, we issue
a number of vtysh -c commands ('no lsp-refresh-interval level-1 500',
'no max-lsp-lifetime level-1 1000'), which reset these parameters to their
default values, respectively 900 and 1200. Depending on the actual
values in the current config, the order in which these commands are sent
might be the wrong one, in that we hit an invalid intermediate state and
make vtysh (and by extension frr-reload.py) return an error.
As a workaround, let's add a one-liner command that sets all these
inter-related parameters in one go, and make isisd display them as a
single line too, so that the diff will be computed as a single command.
The old individual commands are kept to ensure backwards compatibility.
Signed-off-by: Emanuele Di Pascale <emanuele@voltanet.io>
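The ordering problem can be reproduced with a small model of the
constraint. This is a sketch only: `struct lsp_timers`, `timers_valid()`
and `set_timers()` are hypothetical names, not isisd code; the margin of
300 and the defaults of 900 and 1200 are the values quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All values are in seconds. */
#define LIFETIME_MARGIN 300
#define DEFAULT_REFRESH 900
#define DEFAULT_LIFETIME 1200

struct lsp_timers {
	unsigned int refresh;
	unsigned int lifetime;
};

/* The constraint: lifetime >= refresh + 300. */
static bool timers_valid(const struct lsp_timers *t)
{
	return t->lifetime >= t->refresh + LIFETIME_MARGIN;
}

/*
 * Apply new values only if the result is valid. Changing a single
 * parameter is modelled by passing the current value for the other.
 */
static bool set_timers(struct lsp_timers *t, unsigned int refresh,
		       unsigned int lifetime)
{
	struct lsp_timers candidate = { refresh, lifetime };

	assert(t != NULL);
	if (!timers_valid(&candidate))
		return false;
	*t = candidate;
	return true;
}
```

Starting from refresh 500 and lifetime 1000, resetting the refresh interval
first gives 1000 < 900 + 300 and is rejected, while resetting the lifetime
first, or setting both values in one call, is accepted.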
Quentin Young [Sun, 5 Apr 2020 21:11:25 +0000 (17:11 -0400)]
bgpd: fix multiple bugs with cluster_list attrs
Multiple different issues causing mostly UAFs but maybe other more
subtle things.
- Cluster lists were the only attributes whose pointers were not being
NULL'd when freed, resulting in heap UAF
- When performing an insert into the cluster hash, our temporary struct
used for hash_get() was inconsistent with our hash keying and
comparison functions. In the case of a zero length cluster list, the
->length field is 0 and the ->list field is NULL. When performing an
insert, we set the ->list field regardless of whether the length is 0.
This resulted in the two cluster lists hashing equal but not comparing
equal. Later, when removing one of them from the hash before freeing
it, because the key matched and the comparison succeeded (because it
was set to NULL *after* the search but *before* inserting into the
hash) we would sometimes release the duplicated copy of the struct,
and then free the one that remained in the hash table. Later accesses
constitute UAF. This is fixed by making sure the fields used for the
existence check match what is actually inserted into the hash when
that check fails.
This patch also makes cluster_unintern static, because it should be.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
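The second issue comes down to one invariant: the key used for the lookup
must be built the same way as the entry that ends up in the table. A
minimal sketch, assuming a comparator that is pointer-sensitive for empty
lists; `struct cluster_key` and the three helpers are hypothetical and are
not the bgpd functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct cluster_key {
	size_t length;       /* bytes in list */
	const uint8_t *list; /* NULL when length is 0 */
};

/* Single constructor used for both lookup and insert. */
static struct cluster_key cluster_key_make(const uint8_t *data, size_t length)
{
	struct cluster_key k = { length, length ? data : NULL };
	return k;
}

/* FNV-1a over the list bytes; an empty list hashes to the basis. */
static uint32_t cluster_key_hash(const struct cluster_key *k)
{
	uint32_t h = 2166136261u;

	assert(k != NULL);
	for (size_t i = 0; i < k->length; i++) {
		h ^= k->list[i];
		h *= 16777619u;
	}
	return h;
}

/* Empty lists compare by pointer, non-empty ones by content. */
static bool cluster_key_equal(const struct cluster_key *a,
			      const struct cluster_key *b)
{
	if (a->length != b->length)
		return false;
	if (a->length == 0)
		return a->list == b->list;
	return memcmp(a->list, b->list, a->length) == 0;
}
```

A key built by hand as `{ 0, buf }` hashes the same as a normalized empty
key but does not compare equal to it, which is the kind of mismatch
described above; routing every key through `cluster_key_make()` removes it.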
lib: consolidate flexible array hack in a single place
Old gcc versions (< 5.x) have a bug that prevents C99 flexible
arrays from working properly on shared libraries.
We already have a hack in place to work around this problem, but it
needs to be replicated in every declaration of a frr_yang_module_info
variable within libfrr. This clearly isn't a good solution if we
consider that many more libfrr YANG modules are about to come in
the future.
This commit introduces a different workaround that operates within
the northbound layer itself, such that implementers of libfrr YANG
modules won't need to worry about this problem anymore.
lib, tools: silence harmless warnings in the northbound tools
Our two northbound tools don't have embedded YANG modules like the
other FRR binaries. As such, ly_ctx_set_module_imp_clb() shouldn't be
called when the YANG subsystem is being initialized by a northbound
tool. To make that possible, add a new "embedded_modules" parameter
to the yang_init() function to control whether libyang should look
for embedded modules or not.
With this fix, "gen_northbound_callbacks" and "gen_yang_deviations"
won't emit "YANG model X not embedded, trying external file"
warnings anymore.
David Lamparter [Thu, 2 Apr 2020 19:16:04 +0000 (21:16 +0200)]
bgpd, ospfd, ospf6d: long is not bool :(
... Oops ...
(for context, the defaults code originally didn't have a dedicated
"bool" variant and just used long for bools... I derp'd this when
adding bool as a separate case :( )
Reported-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Lamparter <equinox@diac24.net>
Stephen Worley [Wed, 1 Apr 2020 19:31:40 +0000 (15:31 -0400)]
zebra: free unhashable (dup) NHEs via ID table cleanup
Free unhashable (duplicate NHEs from the kernel) via ID table
cleanup. Since the NHE ID hash table contains extra entries,
that's the one we need to be calling zebra_nhg_hash_free()
on, otherwise we will never free the unhashable NHEs.
This was found via a memleak:
==1478713== HEAP SUMMARY:
==1478713== in use at exit: 10,267 bytes in 46 blocks
==1478713== total heap usage: 76,810 allocs, 76,764 frees, 3,901,237 bytes allocated
==1478713==
==1478713== 208 (88 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 35 of 41
==1478713== at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1478713== by 0x48E35E8: qcalloc (memory.c:110)
==1478713== by 0x451CCB: zebra_nhg_alloc (zebra_nhg.c:369)
==1478713== by 0x453DE3: zebra_nhg_copy (zebra_nhg.c:379)
==1478713== by 0x452670: nhg_ctx_process_new (zebra_nhg.c:1143)
==1478713== by 0x4523A8: nhg_ctx_process (zebra_nhg.c:1234)
==1478713== by 0x452A2D: zebra_nhg_kernel_find (zebra_nhg.c:1294)
==1478713== by 0x4326E0: netlink_nexthop_change (rt_netlink.c:2433)
==1478713== by 0x427320: netlink_parse_info (kernel_netlink.c:945)
==1478713== by 0x432DAD: netlink_nexthop_read (rt_netlink.c:2488)
==1478713== by 0x41B600: interface_list (if_netlink.c:1486)
==1478713== by 0x457275: zebra_ns_enable (zebra_ns.c:127)
Repro with:
ip next add id 1 blackhole
ip next add id 2 blackhole
valgrind /usr/lib/frr/zebra
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
lynne [Sun, 29 Mar 2020 17:47:36 +0000 (13:47 -0400)]
ldpd: fixing host-only configuration filter.
There is configuration in LDP to only create labels for
host-routes. If the user removed this configuration, the code
was not readvertising non-host routes to its LDP neighbors.
The issue was the same in reverse. If the user added this
configuration on an active LDP session, the non-host routes were
not withdrawn.
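The expected behaviour on a filter change can be sketched as a per-prefix
decision. This is an illustration under assumptions, not the ldpd code:
`is_host_route()`, `on_host_only_change()` and `enum label_action` are
hypothetical names, and a host route is taken to be a /32 for IPv4 or a
/128 for IPv6.

```c
#include <assert.h>
#include <stdbool.h>

enum label_action { ACTION_NONE, ACTION_ADVERTISE, ACTION_WITHDRAW };

/* A host route covers exactly one address. */
static bool is_host_route(bool ipv6, unsigned int prefixlen)
{
	assert(prefixlen <= (ipv6 ? 128u : 32u));
	return prefixlen == (ipv6 ? 128u : 32u);
}

/*
 * Decide what to do with one prefix when the host-only setting
 * goes from `was` to `now`. Host routes are never affected.
 */
static enum label_action on_host_only_change(bool was, bool now, bool ipv6,
					     unsigned int prefixlen)
{
	if (was == now || is_host_route(ipv6, prefixlen))
		return ACTION_NONE;
	return now ? ACTION_WITHDRAW : ACTION_ADVERTISE;
}
```

Both directions of the change are covered by the same function, which
matches the two cases named in the message.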
Donald Sharp [Tue, 31 Mar 2020 11:55:17 +0000 (07:55 -0400)]
ospf6d: Recent changes in our build cause const to be respected
We are seeing this crash:
[New LWP 7673]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/ospf6d -d -F datacenter -M snmp -A ::1'.
Program terminated with signal SIGABRT, Aborted.
(gdb) bt
vtysh=vtysh@entry=0) at lib/command.c:1288
(gdb)
The command entered is `debug ospf6 lsa inter-router examin`. Code
inspection leads us to the fact that FRR is declaring the data as
const but we are attempting to modify it, causing the crash.
Remove the const of this set/get and let things work.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>