Rafael Zalamena [Mon, 30 Sep 2019 13:34:49 +0000 (10:34 -0300)]
lib: implement route map northbound
Based on the route map old CLI, implement the route map handling using
the exported functions.
Use a curry-like programming pattern avoid code repetition when
destroying match/set entries. This is needed by other daemons that
implement custom route map functions and need to pass to lib their
specific destroy functions.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Rafael Zalamena [Fri, 27 Sep 2019 22:32:10 +0000 (19:32 -0300)]
yang: update route map model
Important changes:
* Rename top container `route-map` to `lib`;
* Rename list `instance` to `route-map`;
* Move route map repeated data to list `entry`;
* Use interface reference instead of typedef'ed string;
* Remove some zebra specific route map conditions;
* Protect `tag` set value with `when "./action = 'tag'"`;
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
Stephen Worley [Tue, 4 Feb 2020 16:35:31 +0000 (11:35 -0500)]
doc: indicate non-support for dynamic pbr maps
Indicate in the doc clearly that we do not support the dynamic
creation of pbr maps for sub-interfaces when the master has
an interface. Such as in vlans. It must be explicitly added.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Donald Sharp [Mon, 13 Jan 2020 21:11:46 +0000 (16:11 -0500)]
zebra: nexthop groups vrf's are only a function of namespaces
Nexthop groups as a whole do not make sense to have a vrf'ness
As that you can have a arbitrary number of nexthops that point
to separate vrf's.
Modify the code to make this distinction, by clearly delineating
the line between the nhg and the nexthop a bit better.
Nexthop groups having a vrf_id only make sense if you are using
network namespaces to represent them.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Mon, 13 Jan 2020 21:00:33 +0000 (16:00 -0500)]
zebra: Modify 'show nexthop-group rib ip|ipv6'
The zebra implementation of nexthop groups has
two types of nexthops groups currently. Singleton
objects which have afi's and combined nexthop groups
that do not. Specifically call this out in the code
to make this distinction.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Santosh P K [Thu, 9 Jan 2020 17:53:27 +0000 (09:53 -0800)]
zebra: Capabality and stale route handling for GR client.
Handling capability received from client. It may contain
GR enable/disable, Stale time changes, RIB update complete
for given AFi, ASAFI and instance. It also has changes for
stale route handling.
Stephen Worley [Mon, 6 Jan 2020 18:33:45 +0000 (13:33 -0500)]
zebra: reset nexthop pointer in zread of nexthops
We were not resetting the nexthop pointer to NULL for each
new read of a nexthop from the zapi route. On the chance we
get a nexthop that does not have a proper type, we will not
create a new nexthop and update that pointer, thus it still
has the last valid one and will create a group with two
pointers to the same nexthop.
Then when it enters any code that iterates the group, it loops
endlessly.
This was found with zapi fuzzing.
```
0x00007f728891f1c3 in jhash2 (k=<optimized out>, length=<optimized out>, initval=12183506) at lib/jhash.c:138
0x00007f728896d92c in nexthop_hash (nexthop=<optimized out>) at lib/nexthop.c:563
0x00007f7288979ece in nexthop_group_hash (nhg=<optimized out>) at lib/nexthop_group.c:394
0x0000000000621036 in zebra_nhg_hash_key (arg=<optimized out>) at zebra/zebra_nhg.c:356
0x00007f72888ec0e1 in hash_get (hash=<optimized out>, data=0x7ffffb94aef0, alloc_func=0x0) at lib/hash.c:138
0x00007f72888ee118 in hash_lookup (hash=0x7f7288de2f10, data=0x7f728908e7fc) at lib/hash.c:183
0x0000000000626613 in zebra_nhg_find (nhe=0x7ffffb94b080, id=0, nhg=0x6020000032d0, nhg_depends=0x0, vrf_id=<optimized out>,
afi=<optimized out>, type=<optimized out>) at zebra/zebra_nhg.c:541
0x0000000000625f39 in zebra_nhg_rib_find (id=0, nhg=<optimized out>, rt_afi=AFI_IP) at zebra/zebra_nhg.c:1126
0x000000000065f953 in rib_add_multipath (afi=AFI_IP, safi=<optimized out>, p=0x7ffffb94b370, src_p=0x0, re=0x6070000013d0,
ng=0x7f728908e7fc) at zebra/zebra_rib.c:2616
0x0000000000768f90 in zread_route_add (client=0x61f000000080, hdr=<optimized out>, msg=<optimized out>, zvrf=<optimized out>)
at zebra/zapi_msg.c:1596
0x000000000077c135 in zserv_handle_commands (client=<optimized out>, msg=0x61b000000780) at zebra/zapi_msg.c:2636
0x0000000000575e1f in main (argc=<optimized out>, argv=<optimized out>) at zebra/main.c:309
```
Stephen Worley [Mon, 6 Jan 2020 17:58:41 +0000 (12:58 -0500)]
zebra: don't created connected if duplicate depend
Since we are using a UNIQUE RB tree, we need to handle the
case of adding in a duplicate entry into it.
The list API code returns NULL when a successfull add
occurs, so lets pull that handling further up into
the connected handlers. Then, free the allocated
connected struct if it is a duplicate.
This is a pretty unlikely situation to happen.
Also, pull up the RB handling of _del RB API as well.
This was found with the zapi fuzzing code.
```
==1052840==
==1052840== 200 bytes in 5 blocks are definitely lost in loss record 545 of 663
==1052840== at 0x483BB1A: calloc (vg_replace_malloc.c:762)
==1052840== by 0x48E1008: qcalloc (memory.c:110)
==1052840== by 0x44D357: nhg_connected_new (zebra_nhg.c:73)
==1052840== by 0x44D300: nhg_connected_tree_add_nhe (zebra_nhg.c:123)
==1052840== by 0x44FBDC: depends_add (zebra_nhg.c:1077)
==1052840== by 0x44FD62: depends_find_add (zebra_nhg.c:1090)
==1052840== by 0x44E46D: zebra_nhg_find (zebra_nhg.c:567)
==1052840== by 0x44E1FE: zebra_nhg_rib_find (zebra_nhg.c:1126)
==1052840== by 0x45AD3D: rib_add_multipath (zebra_rib.c:2616)
==1052840== by 0x4977DC: zread_route_add (zapi_msg.c:1596)
==1052840== by 0x49ABB9: zserv_handle_commands (zapi_msg.c:2636)
==1052840== by 0x428B11: main (main.c:309)
```
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Santosh P K [Thu, 9 Jan 2020 16:39:50 +0000 (08:39 -0800)]
zebra: Handling of connection disconnect and connect with GR.
Zebra will have special handling for clients with GR enabled.
When client disconnects with GR enabled, then a stale client
will be created and its RIB will be retained till stale timer
or client comes up and updated its RIB.
Co-authored-by: Santosh P K <sapk@vmware.com> Co-authored-by: Soman K S <somanks@vmware.com> Signed-off-by: Santosh P K <sapk@vmware.com>
Santosh P K [Tue, 22 Oct 2019 14:01:58 +0000 (07:01 -0700)]
zebra: Header file changes and show commands.
Adding header files changes where structure to hold
received graceful restart info from client is defined.
Also there are changes for show commands where exisiting
commands are extended.
Co-authored-by: Santosh P K <sapk@vmware.com> Co-authored-by: Soman K S <somanks@vmware.com> Signed-off-by: Santosh P K <sapk@vmware.com>
Santosh P K [Wed, 29 Jan 2020 06:50:27 +0000 (22:50 -0800)]
lib: Adding GR capabilites encode and decode.
For Graceful restart clients have to send GR capabilities
library functions are added to encode capabilities and
also for zebra to decode client capabilities.
Co-authored-by: Santosh P K <sapk@vmware.com> Co-authored-by: Soman K S <somanks@vmware.com> Signed-off-by: Santosh P K <sapk@vmware.com>
Chirag Shah [Wed, 22 Jan 2020 20:22:27 +0000 (12:22 -0800)]
bgpd: fix memory leak in evpn json outputs
Found memory leak in json output of evpn's route
commands.
After executing 'show bgp l2vpn evpn route type prefix json'
and 'show bgp l2vpn evpn route type macip json' few times
(6 times) with more than 600 routes in total seeing
memory footprint for bgpd continue to grow.
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 12 MiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 8390 KiB
Free small blocks: 1760 bytes
Free ordinary blocks: 3762 KiB
Ordinary blocks: 1161
Small blocks: 51
Holding blocks: 0
Ticket:CM-27920
Testing Done:
After fix:
excute few times,
'show bgp l2vpn evpn route type prefix json'
and 'show bgp l2vpn evpn route type macip json'
commands where used ordinary blocks (uordblks) is
in steady state.
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 9968 KiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 6486 KiB
Free small blocks: 1984 bytes
Free ordinary blocks: 3482 KiB
Ordinary blocks: 1110
Small blocks: 54
Holding blocks: 0
Memory statistics for bgpd:
System allocator statistics:
Total heap allocated: 10100 KiB
Holding block headers: 0 bytes
Used small blocks: 0 bytes
Used ordinary blocks: 6488 KiB
Free small blocks: 1984 bytes
Free ordinary blocks: 3612 KiB
Ordinary blocks: 1113
Small blocks: 54
Holding blocks: 0
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
bisdhdh [Sat, 26 Oct 2019 16:40:33 +0000 (22:10 +0530)]
bgpd: Adding bgp peer route processing and EOR state Signalling from BGPD to Zebra.
* While the Deferral timer is running, signal route update pending
(ZEBRA_CLIENT_ROUTE_UPDATE_PENDING) from BGPD to Zebra.
* After expiry of the Deferral timer, the deferred routes are processed.
When the deferred route_list becomes empty, End-of-Rib is send to the
peer and route processing complete message (ZEBRA_CLIENT_ROUTE_UPDATE_COMPLETE)
is sent to Zebra. So that Zebra would delete any stale routes still
present in the rib.
bisdhdh [Fri, 25 Oct 2019 20:20:27 +0000 (01:50 +0530)]
bgpd: Add rib-stale-time(running in Zebra).
* Added CLI commands to update rib-stale-time, running in
Cmd : "bgp gaceful-restart rib-stale-time (1-3000)".
Cmd : "no bgp gaceful-restart rib-stale-time".
* Integrating the hooks function for signalling from BGPD
to ZEBRA to ZEBRA to enable or disable GR feature in ZEBRA
depending on bgp per peer gr configuration.
bisdhdh [Fri, 25 Oct 2019 19:58:45 +0000 (01:28 +0530)]
bgpd: Adding helper caller hooks for BGPD-ZEBRA integration for GR.
*Adding helper caller hooks function for signalling from BGPD
to ZEBRA to enable or disable GR feature in ZEBRA depending
on bgp per peer gr configuration.
bisdhdh [Fri, 25 Oct 2019 18:58:57 +0000 (00:28 +0530)]
bgpd: Zebra lib for Graceful Restart.
These changes are for Zebra lib in order to supportGraceful Restart
feature. These changes are addedtemporarily, until Zebra Graceful
Restart lib Pr is merged.
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com> Signed-off-by: Soman K S <somanks@vmware.com>
bisdhdh [Fri, 25 Oct 2019 15:42:39 +0000 (21:12 +0530)]
bgpd: Adding header files for BGPD-ZEBRA integration for GR.
Data Structures, function declaration and Macros forSignalling
from BGPD to ZEBRA to enable or disable GR feature in ZEBRA
depending on bgp per peer gr configuration.
bisdhdh [Thu, 24 Oct 2019 14:59:43 +0000 (20:29 +0530)]
bgpd: Restarting node does not send EOR after the convergence.
*After a restarting router comes up and the bgp session is
successfully established with the peer. If the restarting
router doesn’t have any route to send, it send EOR to
the peer immediately before receiving updates from its peers.
*Instead the restarting router should send EOR, if the
selection deferral timer is not running OR count of eor received
and eor required are matches then send EOR.
bisdhdh [Thu, 24 Oct 2019 10:21:18 +0000 (15:51 +0530)]
bgpd: Added hidden CLI command to disable sending of End-of-Rib.
BGP disable EOR sending is a useful command for testing various
scenarios of BGP graceful restart.
* Added the hidden CLI command : bgp graceful-restart disable-eor
* The CLI will not be displayed in "show running-config" and will not
be stored in configuration file.
* When enabled, EOR will not be sent to peer
Signed-off-by: Biswajit Sadhu <sadhub@vmware.com> Signed-off-by: Soman K S <somanks@vmware.com>
bisdhdh [Thu, 24 Oct 2019 04:40:53 +0000 (10:10 +0530)]
bgpd: BGP tcp session failed to apply GR configuration on the transferred
bgp tcp connection.
When the BGP peer is configured between two bgp routes both routers would create
peer structure , when they receive each other’s open message. In this event both
speakers, open duplicate TCP sessions and send OPEN messages on each socket
simultaneously, the BGP Identifier is used to resolve which socket should be closed.
If BGP GR is enabled the old tcp session is dumped and the new session is retained.
So while this transfer of connection is happening, if all the bgp gr config
is not migrated to the new connection, the new bgp gr mode will never get applied.
Fix Summary:
1. Replicate GR configuration from the old session to the new session in bgp_accept().
2. Replicate GR configuration from stub to full-fledged peer in bgp_establish().
3. Disable all NSF flags, clear stale routes (if present), stop restart & stale timers
(if they are running) when the bgp GR mode is changed to “Disabled”.
4. Disable R-bit in cap, if it is not set the received open message.
bisdhdh [Thu, 24 Oct 2019 03:34:05 +0000 (09:04 +0530)]
bgpd: show BGP GR Neighbor mode as “NotApplicable”,when local mode is “Disable”.
BGP GR Neighbor mode is showing the default string as “NotRecieved”,
as the bgp gr neighbour capability was not processed,
since the local mode is “Disable”.
However now it would be changed to “NotApplicable”.
bisdhdh [Wed, 23 Oct 2019 19:15:43 +0000 (00:45 +0530)]
bgpd: Fix for BGP core when connected routes are redistributed
& GR is enabled.
When GR with deferral is enabled and connected routes are
distributed then in one race condition route node gets added
in to both deferred queue and work queue. If deferred queue
gets processed first then it ends up delete only flag while
leaving the entry in the work queue as it is. When a new update
comes for the same route node next time from peer then it hits
assert. Assert check is added to ensure we don’t add to work queue
again while it is already present.
So, check before adding in to deferred queue if it is already present
in work queue and bail if so.
bisdhdh [Wed, 23 Oct 2019 19:12:10 +0000 (00:42 +0530)]
bgpd: Adding BGP GR change mode config apply on notification sent & received.
* Changing GR mode on a router needs a session reset from the
SAME router to negotiate new GR capability.
* The present GR implementation needs a session reset after every
new BGP GR mode change.
* When BGP session reset happens due to sending or receiving BGP
notification after changing BGP GR mode, there is no need of
explicit session reset.