Pat Ruddy [Fri, 22 Jan 2021 10:11:22 +0000 (10:11 +0000)]
test: add snmp skip test
Since SNMP is a pain to install, add a check which will be used
in all future SNMP tests to silently skip them if SNMP
has not been installed on the base system.
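A minimal sketch of such a check, not the helper actually added by this commit. The function name `snmp_missing_reason` and the list of required binaries are assumptions made for illustration:

```python
import shutil


def snmp_missing_reason(required=("snmpd", "snmpget", "snmpwalk"), which=shutil.which):
    """Return a skip reason if any required SNMP binary is absent, else None.

    The tool names in `required` are assumed for this sketch; `which` is
    injectable so the check can be exercised without touching the system.
    """
    missing = [tool for tool in required if which(tool) is None]
    if missing:
        return "SNMP not installed (missing: " + ", ".join(missing) + ")"
    return None


# Intended use at the top of an SNMP test (requires pytest):
#     reason = snmp_missing_reason()
#     if reason:
#         pytest.skip(reason)
```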
Sarita Patra [Tue, 12 Jan 2021 10:46:35 +0000 (02:46 -0800)]
bgpd : multiple memory leak fixes in show commands
Issue: bgpd got killed due to out of memory when the `show bgp
neighbor json` and `show ip bgp neighbor <ip> routes json`
commands were executed multiple times in a setup having 320554
routes.
RCA: Heap allocated for bgpd keeps increasing. This was verified
using the top command and the show memory command.
Memleak Fix-1: show ip bgp route json command
When dumping a large amount of table data via bgp_show_route,
if there is no information to display for a particular
`struct bgp_node *`, the data allocated via json_object_new_array()
is not freed. This is resolved now.
Memleak Fix-2:
The function bgp_peer_counts() doesn't free the memory allocated for
json_loop when there is No such neighbor or address family. This is
fixed now.
Donald Sharp [Thu, 21 Jan 2021 14:14:27 +0000 (09:14 -0500)]
bgpd: Add afi/safi info to debug processing data
When debugging in bgp is turned on for route-map processing
it would be awful nice to know what afi-safi we are working on
for the particular route-map, especially when using a route-map
across different peers and different afi/safis.
Donatas Abraitis [Thu, 21 Jan 2021 13:03:40 +0000 (15:03 +0200)]
bgpd: Set NO_ADVERTISE community if blackhole community received
rfc7999:
A BGP speaker receiving an announcement tagged with the BLACKHOLE
community SHOULD add the NO_ADVERTISE or NO_EXPORT community as
defined in [RFC1997], or a similar community, to prevent propagation
of the prefix outside the local AS. The community to prevent
propagation SHOULD be chosen according to the operator's routing
policy.
Sent:
```
router bgp 65534
no bgp ebgp-requires-policy
neighbor 192.168.0.2 remote-as 65030
!
address-family ipv4 unicast
redistribute connected
neighbor 192.168.0.2 route-map spine out
exit-address-family
!
!
ip prefix-list self seq 5 permit 192.168.100.1/32
!
route-map spine permit 10
match ip address prefix-list self
set community blackhole
!
```
Received:
```
spine1-debian-9# show ip bgp 192.168.100.1/32
BGP routing table entry for 192.168.100.1/32
Paths: (1 available, best #1, table default, inform peer to blackhole prefix)
Not advertised to any peer
65534
192.168.0.1 from 192.168.0.1 (192.168.100.1)
Origin incomplete, metric 0, valid, external, best (First path received)
Community: blackhole no-advertise
Last update: Thu Jan 21 12:56:39 2021
```
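The receive-side rule quoted from rfc7999 can be sketched as follows. This is an illustration only, not the bgpd code; the function name is hypothetical, while the community values are the well-known ones (BLACKHOLE is 65535:666):

```python
# Well-known community values as 32-bit integers.
COMMUNITY_NO_EXPORT = 0xFFFFFF01
COMMUNITY_NO_ADVERTISE = 0xFFFFFF02
COMMUNITY_BLACKHOLE = 0xFFFF029A  # 65535:666


def add_no_advertise_if_blackhole(communities):
    """Return a copy of `communities` with NO_ADVERTISE appended when the
    BLACKHOLE community is present and NO_ADVERTISE is not already set."""
    result = list(communities)
    if COMMUNITY_BLACKHOLE in result and COMMUNITY_NO_ADVERTISE not in result:
        result.append(COMMUNITY_NO_ADVERTISE)
    return result
```

Routes without the blackhole community pass through unchanged, which matches the `Community: blackhole no-advertise` output above only for the tagged prefix.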
Donatas Abraitis [Thu, 21 Jan 2021 13:00:26 +0000 (15:00 +0200)]
lib: List all possible well-known communities in CLI (COMMUNITY_VAL_STR)
```
exit1-debian-9(config-route-map)# set community
AA:NN Community number in AA:NN format (where AA and NN are (0-65535)) or local-AS|no-advertise|no-export|internet|graceful-shutdown|accept-own-nexthop|accept-own|route-filter-translated-v4|route-filter-v4|route-filter-translated-v6|route-filter-v6|llgr-stale|no-llgr|blackhole|no-peer or additive
none No community attribute
```
Donald Sharp [Sun, 17 Jan 2021 21:08:03 +0000 (16:08 -0500)]
bgpd: Use uint32_t for size value instead of int in ecommunity struct
The `struct ecommunity` structure is using an int for a size value.
Let's switch it over to a uint32_t for size values since a size
value for data can never be negative.
Donald Sharp [Sun, 17 Jan 2021 12:51:09 +0000 (07:51 -0500)]
zebra: Tell SA that we are intentionally ignoring the return
When calling fpm_nl_enqueue we should expect a fit-or-not
return value for the outgoing stream. It is not necessary
to check it here because the while loop where we are checking this
has already ensured that the data being written will fit.
CID -> 1499854
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Sun, 17 Jan 2021 12:43:44 +0000 (07:43 -0500)]
bgpd: attr is already derefed cannot be null here
In the function bgp_adj_out_set_subgroup, the attr pointer
is already derefed in all paths leading to a test for NULL.
You cannot pass a NULL attribute in since the whole function
would just immediately crash.
CID -> 1500604
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Sat, 16 Jan 2021 13:29:49 +0000 (08:29 -0500)]
tests: Set default timers to 3/10 for bgp using create_router_bgp
Tests were timing out in our test system due to lost packets and
flakiness of the lower end systems. Just set the timers to 3/10
and give them plenty of time to converge.
Donald Sharp [Fri, 15 Jan 2021 21:28:15 +0000 (16:28 -0500)]
zebra: A `zebra route-map delay-timer 0` command should still run the route-map
Setting `zebra route-map delay-timer 0` completely turns off any
route-map processing in zebra, which is completely wrong. A timer
of 0 means `do it now`.
Donald Sharp [Thu, 1 Oct 2020 22:24:01 +0000 (18:24 -0400)]
tests: Modify zebra_rib tests to include some basic route-map tests
New test does this:
a) Ensures that we run the correct number of times given two
`ip protocol X` commands (i.e. we do not run the route-map application
against all routes, only those affected).
b) Ensures that when we modify the route-map the state ends up sane.
This includes making a static route depend on a sharp route that
gets removed as a result of the change to the sharp route-map.
c) Ensures that the kernel routes are correct.
Donald Sharp [Fri, 9 Oct 2020 15:41:21 +0000 (11:41 -0400)]
zebra: Push timer out if another route-map change comes in for zebra
If we are running with a delayed timer to handle route-map changes
in zebra and another route-map change is made via the cli, push
out the timer instead of leaving it unmodified. This
allows a large set of route-maps to be read in by
the system without reaching a state where new route-map
changes are still being read in and the timer pops in
the middle of it.
Additionally convert to use THREAD_OFF, preventing a possible
use after free as well as aligning the thread api usage
with what we consider correct.
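The push-out behaviour can be sketched with a simulated clock. This is an illustration of the idea and not zebra's thread API; the class and method names are hypothetical:

```python
class DelayedRouteMapUpdate:
    """Debounce route-map changes: each new change pushes the deadline out."""

    def __init__(self, delay):
        self.delay = delay      # seconds to wait after the last change
        self.deadline = None    # None means no update is pending

    def on_change(self, now):
        # Always re-arm relative to the latest change, so a burst of
        # changes results in a single update after the burst ends.
        self.deadline = now + self.delay

    def poll(self, now):
        """Return True exactly once when the pending update is due."""
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            return True
        return False
```

With a delay of 0 the deadline equals the time of the change, so the update is due immediately rather than never.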
Donald Sharp [Thu, 1 Oct 2020 15:18:45 +0000 (11:18 -0400)]
zebra: Limit routemap changes to reconsider only routes associated with that rm
Current code, when a route-map changes, schedules a rerun of all routes in the
particular table. So if you modify the route-map `FOO` used by
`ip protocol XX route-map FOO`, all routes will be rechecked. This is extremely expensive.
Modify zebra to only update the routes associated with the route-map. So
if we have 800k bgp routes and 50 ospf routes and we are route-map'ing
the ospf routes we'll only look at 50 routes.
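A toy illustration of the saving, not the zebra implementation. The route representation and the function name are assumptions, and the table is scaled down from the 800k figure above:

```python
def routes_to_reprocess(rib, route_type):
    """Select only the routes whose protocol uses the changed route-map."""
    return [route for route in rib if route["type"] == route_type]


# Scaled-down table: 800 bgp routes and 50 ospf routes.
rib = [{"prefix": "10.0.%d.%d/32" % (i // 256, i % 256), "type": "bgp"} for i in range(800)]
rib += [{"prefix": "172.16.0.%d/32" % i, "type": "ospf"} for i in range(50)]

# A change to the ospf route-map touches 50 routes instead of all 850.
affected = routes_to_reprocess(rib, "ospf")
```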
Donald Sharp [Thu, 1 Oct 2020 13:54:53 +0000 (09:54 -0400)]
zebra: Allow rib_update_table to receive a specified route type
When we need to cause a reprocessing of data the code currently
marks all routes as needing to be looked at. Modify the
rib_update_table code to allow us to specify the single route
type we want to reprocess. At this point none
of the code is behaving differently; this is just setup
for a future code change.
Donald Sharp [Fri, 2 Oct 2020 11:18:58 +0000 (07:18 -0400)]
lib: Keep track of route-map applications per section
When the routemap code was rewritten for performance the
code to track the number of times a particular section of
a route-map was applied was not correctly updated. In
this case I found another sequence of events where the
number of times a section was invoked was not being correctly
kept.
Effectively in this case when route_map_get_index is called
and returns an index, the route map has been applied (see that
skip_match_clause is set to true, and then in the for loop
below skip_match_clause is tested and index->applied is
incremented).
Renato Westphal [Fri, 15 Jan 2021 15:04:24 +0000 (12:04 -0300)]
ldpd: fix sporadic failures in the ldp-topo1 topotest
Commit 220e848cc5 introduced an optimization that would prevent ldpd
from sending redundant label mappings when it receives notifications
from zebra about routes that didn't effectively change (such
notifications can happen under certain circumstances).
The problem is that that commit didn't take into account the metric
of the received routes, so it would dismiss a notification of a
route with a better metric taking the place of another route in the
RIB, preventing the newly selected route from receiving the label
mappings it needs.
Revert 220e848cc5 temporarily to fix sporadic failures in the CI
system until we have a better solution.
Duncan Eastoe [Fri, 15 Jan 2021 16:06:17 +0000 (16:06 +0000)]
zebra: set nlmsg_pid in netlink msgs sent by 'fpm'
Use nl_pid from the netlink socket used for programming the kernel
(netlink_dplane) in netlink route messages sent by the 'fpm' module.
This makes 'fpm' consistent with 'dplane_fpm_nl' which already
behaves this way, and allows FPM server implementations to determine
route origin via nlmsg_pid.
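For reference, a sketch of how an FPM server might read nlmsg_pid from the 16-byte netlink header. This is not taken from the 'fpm' module; the helper names are hypothetical, while the header layout (len, type, flags, seq, pid) and the constants follow linux/netlink.h and linux/rtnetlink.h:

```python
import struct

# struct nlmsghdr: u32 nlmsg_len, u16 nlmsg_type, u16 nlmsg_flags,
#                  u32 nlmsg_seq, u32 nlmsg_pid (native byte order).
NLMSGHDR = struct.Struct("=IHHII")
RTM_NEWROUTE = 24
NLM_F_REQUEST = 0x01


def build_nlmsghdr(payload_len, seq, pid, msg_type=RTM_NEWROUTE, flags=NLM_F_REQUEST):
    """Pack a netlink header; nlmsg_len covers the header plus the payload."""
    return NLMSGHDR.pack(NLMSGHDR.size + payload_len, msg_type, flags, seq, pid)


def origin_pid(message):
    """Extract nlmsg_pid from the start of a netlink message."""
    _len, _type, _flags, _seq, pid = NLMSGHDR.unpack_from(message)
    return pid
```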
Donald Sharp [Fri, 15 Jan 2021 13:14:49 +0000 (08:14 -0500)]
bgpd: Allow peer-groups to have `ttl-security hops` configured
The command `neighbor PGROUP ttl-security hops X` was being
accepted but ignored. Allow it to be stored. I am still
not sure that this is applied correctly, but that is another
problem.
Fixes: #7848
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Donald Sharp [Fri, 15 Jan 2021 01:29:14 +0000 (20:29 -0500)]
tests: Start the ability to mark tests
Add the ability for our topotests to take advantage of pytest `mark`ing.
This effectively allows you to tell pytest to run against certain sets
of tests. For demonstration purposes I've added marks for:
babel
eigrp
ldp
ospf
pim
rip
The tests that only test one of those protocols are marked accordingly.
You can run against eigrp tests by running `pytest -k eigrp`
Other combinations are also available based upon simple boolean logic.
Just read the pytest.mark documentation.
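A minimal sketch of what a marked test looks like. The test names and bodies are placeholders, not actual topotests:

```python
import pytest


@pytest.mark.eigrp
def test_eigrp_convergence():
    # Placeholder body; a real topotest would check protocol state here.
    assert True


@pytest.mark.ospf
def test_ospf_convergence():
    # Placeholder body, marked for a different protocol.
    assert True
```

pytest also selects by marker expression with `-m`, for example `pytest -m "eigrp or ospf"`. Registering the markers in the pytest configuration avoids unknown-mark warnings.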
Donald Sharp [Thu, 14 Jan 2021 20:51:39 +0000 (15:51 -0500)]
bgpd: Temp fix to allow numbered peers to be part of a peer group
Talked with Chirag, who indicated that we can just back out the command
to the original and things would `work`, and they do (at least in a quick test).
Dewi Morgan [Thu, 14 Jan 2021 14:01:26 +0000 (14:01 +0000)]
bgpd: clear max prefix overflow on de-config
A bgp neighbor remains in Idle state in the event that the number
of received prefixes exceeds the configured maximum prefix for the
neighbor. The neighbor remains in idle state even after de-configuring
the maximum prefix limit for the neighbor.
The fix is to clear the neighbor overflow state, if set, after
de-configuring the neighbor maximum-prefix command.
This allows the neighbor to establish without having to perform a
clear operation. It also avoids the misleading neighbor summary
indicating that the neighbor is in prefix overflow state (PfxCt)
when no limit is configured for the neighbor.
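A toy model of the state handling described above, not the bgpd code. The class, attribute, and method names are hypothetical:

```python
class Neighbor:
    """Tracks only the maximum-prefix limit and the overflow flag."""

    def __init__(self):
        self.max_prefix = None   # None means no limit configured
        self.overflow = False    # set when the limit has been exceeded

    def receive_prefixes(self, count):
        if self.max_prefix is not None and count > self.max_prefix:
            self.overflow = True

    def unset_max_prefix(self):
        # The fix: removing the limit also clears any overflow state.
        self.max_prefix = None
        self.overflow = False

    def can_establish(self):
        # A neighbor in overflow state stays Idle.
        return not self.overflow
```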
Signed-off-by: Dewi Morgan <dewi.morgan@intl.att.com>
Chirag Shah [Wed, 13 Jan 2021 17:02:32 +0000 (09:02 -0800)]
staticd: handle when condition check in nb callbacks
At present, the libyang validate api takes longer
for a transaction to complete if the same config is re-applied.
For instance, if a set of static routes is reapplied, the config
completion takes longer than it did the first time.
One solution is to remove the when statement from the staticd nexthop yang OM.
The when condition adds a performance toll on libyang's validate api.
The same when condition checks are done in the frr northbound
validation phase (which is much faster).
With this change, if the same static routes are configured
again and again, the time to completion does not go up and
performance does not degrade.
Ticket:CM-32530
Testing Done:
Configure 400 static routes across two vrfs and keep re-applying them.
The time to complete the config remains within a few seconds.
Before:
root@bharat:~/stash/frr4# time vtysh -f static_route_cfg
real 0m19.877s
user 0m0.263s
sys 0m0.014s
After:
root@bharat:~/stash/frr4# time vtysh -f static_route_cfg
real 0m3.857s
user 0m0.239s
sys 0m0.034s
Co-developed-by: VishalDhingra <vdhingra@vmware.com>
Signed-off-by: Chirag Shah <chirag@nvidia.com>