Donald Sharp [Wed, 13 Dec 2017 15:58:55 +0000 (10:58 -0500)]
ospf6d: Don't assign to values that are never read
offset and offsetlen are never read before being reassigned
in the code. So comment out the assignments, but keep them
in place in case we want to start using the code for snmp
changes in the future.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Rafael Zalamena [Mon, 11 Dec 2017 20:25:06 +0000 (18:25 -0200)]
isisd: fix l2 neighbor formations
Add timestamp information for level 2 circuits; otherwise, if the
circuit is marked as already processed on level 1, we will not process
level 2 areas.
Signed-off-by: Rafael Zalamena <rzalamena@opensourcerouting.org>
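A minimal sketch of the idea (the struct and field names are illustrative,
not the actual isisd types): keep a per-level timestamp instead of a single
"already processed" marker, so handling level 1 no longer prevents level 2
from being processed in the same pass.

    #include <time.h>

    #define ISIS_LEVELS 2

    /* Illustrative circuit state: one timestamp per level rather than a
     * single marker shared by both levels. */
    struct circuit_sketch {
        time_t last_run[ISIS_LEVELS];
    };

    /* Return 1 if the circuit still needs processing for 'level' in the
     * pass identified by 'now', and record that it has been handled. */
    static int circuit_needs_run(struct circuit_sketch *c, int level, time_t now)
    {
        if (c->last_run[level - 1] == now)
            return 0; /* this level was already done in this pass */
        c->last_run[level - 1] = now;
        return 1;
    }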
Donald Sharp [Thu, 7 Dec 2017 15:47:30 +0000 (10:47 -0500)]
zebra: Try to move non zapi cli commands from zserv.c
zserv.c has become a bit of a dumping ground for zebra cli.
I'd like to focus the zserv.c code on its core functionality,
which is handling the zapi interface.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Rafael Zalamena [Tue, 12 Dec 2017 13:47:04 +0000 (11:47 -0200)]
isisd: save a clock_gettime() call
Use the thread's cached clock as the start time. This saves a call to
clock_gettime() and also provides a more 'accurate' time measurement of
the start of the procedure.
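A rough sketch of the pattern under assumed names (the real FRR thread API
differs): reuse the clock value the event loop captured when it dispatched
the handler instead of calling clock_gettime() again at the start.

    #include <time.h>

    /* Illustrative event context: the loop records the dispatch time once. */
    struct event_sketch {
        struct timespec dispatch_time; /* filled in by the event loop */
    };

    static void handler(struct event_sketch *ev)
    {
        /* Use the cached dispatch time as the start-of-procedure stamp
         * instead of issuing another clock_gettime(CLOCK_MONOTONIC, ...);
         * it is also closer to when the procedure actually started. */
        struct timespec start = ev->dispatch_time;

        (void)start; /* ... measure elapsed time against 'start' ... */
    }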
Donald Sharp [Tue, 5 Dec 2017 15:09:36 +0000 (10:09 -0500)]
bgpd: Cleanup SA error in ignoring return from function
Ignoring the return from argv_find_and_parse_afi
makes the SA system assume that an AFI_MAX value
could be passed to bgp_show_route, which in turn would
cause an out-of-bounds array read.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
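A hedged sketch of the fix's shape; the helper below is a stand-in for
argv_find_and_parse_afi() with an assumed signature, not FRR's real one.
The point is that checking the return value guarantees the AFI is valid
before it is ever used as an array index.

    #include <stdio.h>

    typedef enum { AFI_IP, AFI_IP6, AFI_MAX } afi_t;

    /* Stand-in parser: returns nonzero only when an AFI keyword was
     * actually found and '*afi' was set. */
    static int find_and_parse_afi(const char *arg, afi_t *afi)
    {
        if (!arg)
            return 0;
        *afi = AFI_IP;
        return 1;
    }

    static void show_route(const char *arg)
    {
        afi_t afi = AFI_MAX;

        /* Checking the result tells the static analyzer (and the reader)
         * that 'afi' cannot still be AFI_MAX when indexing per-AFI tables. */
        if (!find_and_parse_afi(arg, &afi)) {
            fprintf(stderr, "unknown address family\n");
            return;
        }
        /* ... safe to index per-AFI arrays with 'afi' here ... */
    }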
Donald Sharp [Tue, 5 Dec 2017 14:44:11 +0000 (09:44 -0500)]
bgpd: Fix prefix2str buffers to use PREFIX_STRLEN instead of BUFSIZ
PREFIX_STRLEN is the correct length for buffers used to hold
prefix2str output. Additionally, clean up some code that sets the
last byte to `\0`; this is already handled by prefix2str.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
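A self-contained sketch of the buffer-sizing pattern; the PREFIX_STRLEN
value and the formatter are simplified stand-ins for the real FRR
definitions.

    #include <stdio.h>

    /* FRR's PREFIX_STRLEN is sized for the longest textual prefix plus the
     * terminating NUL; the value here is illustrative only. */
    #define PREFIX_STRLEN 51

    /* Simplified stand-in for prefix2str(): always NUL-terminates. */
    static const char *prefix2str_sketch(const char *pfx, char *buf, int size)
    {
        snprintf(buf, size, "%s", pfx);
        return buf;
    }

    int main(void)
    {
        /* Right-sized buffer instead of BUFSIZ, and no manual trailing
         * '\0' write afterwards - the formatter already terminates it. */
        char buf[PREFIX_STRLEN];

        prefix2str_sketch("2001:db8::/32", buf, sizeof(buf));
        printf("prefix %s\n", buf);
        return 0;
    }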
Chirag Shah [Fri, 8 Dec 2017 17:33:53 +0000 (09:33 -0800)]
ospfd: prevent passive interface cmd crash
Current OSPF VRF configuration allows pre-provisioning even if the
VRF is not configured. In that case ospf->vrf_id would be VRF_UNKNOWN;
when passive-interface configuration is done under such an ospf
instance, it would look up all VRF devices and try to create an ifp
with the unknown vrf_id.
For the passive-interface config command, only look up the ifp when the
vrf_id is within range.
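A minimal sketch (illustrative names and types) of the guard described
above: refuse the passive-interface lookup while the instance's vrf_id is
still VRF_UNKNOWN instead of searching every VRF device.

    #include <stdint.h>
    #include <stdio.h>

    #define VRF_UNKNOWN UINT32_MAX /* illustrative sentinel value */

    struct ospf_sketch {
        uint32_t vrf_id; /* VRF_UNKNOWN until the VRF actually exists */
    };

    static int passive_interface_set(struct ospf_sketch *ospf, const char *ifname)
    {
        /* Only look up interfaces within the instance's own VRF; a
         * pre-provisioned instance with no VRF has nothing to search. */
        if (ospf->vrf_id == VRF_UNKNOWN) {
            fprintf(stderr, "%% VRF not yet configured for this instance\n");
            return -1;
        }
        /* ... look up or create ifp for 'ifname' within ospf->vrf_id ... */
        (void)ifname;
        return 0;
    }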
Donald Sharp [Thu, 7 Dec 2017 23:59:54 +0000 (18:59 -0500)]
ldpd: Switch over to new debug style
When compiling ldpd on a Mac, there exists a #define MSG_SEND
which conflicts with a define in ldp_debug.h.
During discussion about this we decided that it would be
better to remove the macro massaging that was going on and
to just use our own #define for it.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Chirag Shah [Fri, 1 Dec 2017 01:45:12 +0000 (17:45 -0800)]
ospf6d: Fix multi nexthop route remove
Fix sorting of routes stored in the DB.
Fix the comparison of two lists, which allows routes with
multiple nexthops to be updated.
Ticket:CM-19025
Testing Done:
Configured a topology where ospf6 learns
an ECMP route via one neighbor; upon removal
of the intra-prefix route at the origin, the DUT removes
the ECMP intra-prefix route from the RIB.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Don Slice [Wed, 6 Dec 2017 17:00:48 +0000 (09:00 -0800)]
zebra: do not display ipv6 ra commands created by bgpd
If the frr.conf file contains bgp unnumbered peering but the associated
interfaces do not have the commands "no ipv6 nd suppress-ra" and
"ipv6 nd ra-interval 10" configured, when frr-reload.py is issued the
interface commands are removed from the running config, causing peers to
go down and stay down after a link flap. This situation can occur if
the frr.conf file is created manually or via automation (like ansible)
but a subsequent "wr mem" has not been performed.
This fix changes the behavior so that the interface ipv6 nd ra commands
created by bgp are not displayed. Therefore, when the above condition
occurs, there is no difference between the running and stored configs
and peers work fine.
Ticket: CM-18702
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-7004
Testing-done: Manual testing successful. L3-smoke has no new failures
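A small sketch of one way to express this (the field names are assumptions,
not zebra's actual structures): remember whether the RA settings were
installed on behalf of a daemon and skip them when writing the config.

    #include <stdbool.h>
    #include <stdio.h>

    struct ra_config_sketch {
        bool suppress_ra;
        int ra_interval;
        bool configured_by_user; /* false when set on behalf of bgpd */
    };

    static void config_write_ra(const struct ra_config_sketch *ra)
    {
        /* Skip RA commands that bgpd installed for unnumbered peering, so
         * the running config matches what the operator actually typed. */
        if (!ra->configured_by_user)
            return;
        if (!ra->suppress_ra)
            printf(" no ipv6 nd suppress-ra\n");
        printf(" ipv6 nd ra-interval %d\n", ra->ra_interval);
    }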
Chirag Shah [Mon, 4 Dec 2017 22:08:23 +0000 (14:08 -0800)]
ospfd: fix crash no router ospf/show running
'no router ospf' removes the default ospf instance even
if other non-default VRF instances with interface-level
configuration are present. Look up the ospf instance
for ifp->vrf_id; if an ospf instance is present, use that
to access the 'instance id'.
Ticket: CM-19078
Testing Done:
Ran 'no router ospf' and 'show running-config' along with other
non-default VRF-aware ospf configurations.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Chirag Shah [Mon, 27 Nov 2017 21:59:19 +0000 (13:59 -0800)]
ospfd: Display all vrf aware interface config
OSPF interface-specific configuration can be done independently
of the global 'router ospf [vrf x]' config.
In cases where ospf interface configuration in a non-default VRF
is done prior to 'router ospf vrf x', show running-config
would not display such configuration.
To display it, now walk all VRFs and their interface lists
and only display interfaces where OSPF configuration params are set.
Ticket:CM-18952
Testing Done:
Tried ospf interface-specific configuration with a VRF
where 'router ospf vrf x' is not present.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Donald Sharp [Fri, 1 Dec 2017 16:49:13 +0000 (11:49 -0500)]
bgpd: Allow Address-Family activation to work in certain states
If we are in OpenSent or OpenConfirm peer state and we receive a new
address-family activation, we would end up ignoring the new activation
and not telling our peer about it. You could notice this by the fact
that a 'show bgp neighbor' command returns 'Not in any update group'
for a particular family.
This modifies the code such that we now notice that we are in
either OpenSent or OpenConfirm state and reset the peer to
allow us to send them the new capability.
Ticket: CM-19021
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
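A minimal sketch of the added state check (the state and helper names are
illustrative, not bgpd's actual code):

    enum bgp_state_sketch { Idle, Connect, Active, OpenSent, OpenConfirm, Established };

    struct peer_sketch {
        enum bgp_state_sketch state;
    };

    /* Illustrative reset: drop the session so the OPEN exchange is redone
     * carrying the newly activated address-family capability. */
    static void reset_for_new_afi(struct peer_sketch *p)
    {
        p->state = Idle;
    }

    static void activate_address_family(struct peer_sketch *p)
    {
        /* The fix: an activation arriving while the OPEN exchange is still
         * in flight (OpenSent/OpenConfirm) now also resets the peer;
         * previously these states were silently ignored. */
        if (p->state == OpenSent || p->state == OpenConfirm)
            reset_for_new_afi(p);
    }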
Quentin Young [Thu, 30 Nov 2017 19:11:12 +0000 (14:11 -0500)]
bgpd: intelligently adjust coalesce timer
The subgroup coalesce timer controls how long updates to a particular
subgroup are delayed in order to allow additional peers to join the
subgroup. Presently the timer value is 200 ms. Increase it to 1 second
and adjust up as peers are configured, with an upper cap at 10s.
This cuts convergence time by a factor of 3 at large scale (300+ peers,
1000+ prefixes per peer).
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
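A sketch of the scaling described; the per-peer increment below is a
placeholder, since the text only gives the 1 second base and the 10 second
cap, not the exact step per configured peer.

    #define COALESCE_BASE_MS     1000  /* new default: 1 second */
    #define COALESCE_PER_PEER_MS 10    /* placeholder increment */
    #define COALESCE_MAX_MS      10000 /* upper cap: 10 seconds */

    static unsigned int subgroup_coalesce_ms(unsigned int peer_count)
    {
        unsigned long t = COALESCE_BASE_MS +
                          (unsigned long)peer_count * COALESCE_PER_PEER_MS;

        return t > COALESCE_MAX_MS ? COALESCE_MAX_MS : (unsigned int)t;
    }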
Quentin Young [Mon, 13 Nov 2017 22:59:04 +0000 (17:59 -0500)]
bgpd: turn off keepalives when sending NOTIFY
This is necessary because otherwise between the time we wipe the output
buffer and the time we push the NOTIFY onto it, the KA generation thread
could have pushed a KEEPALIVE in the middle.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
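A sketch of the ordering only; the function names are assumptions, and the
real code coordinates with the keepalive pthread through FRR's own
primitives.

    struct peer_sketch;

    void keepalives_off(struct peer_sketch *p);  /* stop KA generation */
    void obuf_wipe(struct peer_sketch *p);       /* clear output buffer */
    void obuf_push_notify(struct peer_sketch *p, int code, int subcode);

    static void send_notify(struct peer_sketch *p, int code, int subcode)
    {
        /* Turn keepalives off first: otherwise the KA thread could slip a
         * KEEPALIVE onto the freshly wiped buffer ahead of the NOTIFY. */
        keepalives_off(p);
        obuf_wipe(p);
        obuf_push_notify(p, code, subcode);
    }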
Quentin Young [Mon, 13 Nov 2017 08:18:49 +0000 (03:18 -0500)]
bgpd: yield more when generating UPDATEs
In the same vein as the round-robin input commit, this re-adds logic for
limiting the amount of time spent generating UPDATEs per generation
cycle. Missed this when shifting around wpkt_quanta; prior to MT it
limited both calls to write() as well as UPDATE generation.
Quentin Young [Fri, 10 Nov 2017 21:42:49 +0000 (16:42 -0500)]
bgpd: restore packet input limit
Unfortunately, batching input processing severely impacts BGP initial
convergence times. As a consequence of the way update-groups were
implemented, advancing the state of the routing table based on prefixes
learned from one peer prior to all (or at least most) peers establishing
connections will cause us to start generating outbound UPDATEs, which is
a very expensive operation at present. This intensive processing starves
out bgp_accept(), delaying connection of additional peers. When
additional peers do connect the problem gets worse and worse, yielding
approximately exponential growth in convergence time dependent on both
peering and prefix counts. This behavior is present pre-multithreading
as well, but batched input exacerbates it.
Round-robin input processing marginally harms convergence times for
small topologies but should allow much larger topologies to function
within reasonable performance thresholds.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
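A generic sketch of quantum-limited, round-robin input processing (the
names and the quantum value are illustrative): drain at most a fixed number
of packets per job, then reschedule, so other peers and bgp_accept() get
their share of the main thread.

    #include <stdbool.h>

    #define PKT_QUANTUM 10 /* illustrative per-job packet limit */

    struct peer_sketch {
        int pending; /* packets waiting on the input buffer */
    };

    /* Parses one packet and decrements 'pending'; returns false if the
     * peer was reset while handling it. */
    bool process_one_packet(struct peer_sketch *p);
    void reschedule_packet_job(struct peer_sketch *p);

    static void packet_job(struct peer_sketch *p)
    {
        int done = 0;

        while (p->pending > 0 && done < PKT_QUANTUM) {
            if (!process_one_packet(p))
                return; /* peer went away mid-processing */
            done++;
        }
        /* Anything left over is handled in a later round. */
        if (p->pending > 0)
            reschedule_packet_job(p);
    }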
Quentin Young [Tue, 7 Nov 2017 07:49:54 +0000 (02:49 -0500)]
bgpd: schedule process packet as timer
Different places scheduling the same thread should use the same
semantics and thread type. Additionally providing the back reference
here makes sure we only schedule the job once and avoids flooding the
event queue with jobs to process an empty buffer.
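A generic sketch of scheduling through a back reference (this job API is
invented for illustration): a new job is queued only while the stored
pointer is NULL, so repeated triggers cannot flood the event queue.

    #include <stddef.h>

    struct job; /* opaque handle owned by the event loop */

    struct job *schedule_job(void (*fn)(void *), void *arg); /* illustrative */

    struct peer_sketch {
        struct job *t_process_packet; /* back reference: non-NULL while queued */
    };

    static void packet_job_fn(void *arg)
    {
        struct peer_sketch *p = arg;

        p->t_process_packet = NULL; /* job has fired; allow rescheduling */
        /* ... drain the peer's input buffer here ... */
    }

    static void trigger_packet_processing(struct peer_sketch *p)
    {
        /* At most one queued job per peer: if the back reference is set,
         * the pending job will pick up the new data anyway. */
        if (p->t_process_packet == NULL)
            p->t_process_packet = schedule_job(packet_job_fn, p);
    }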
Quentin Young [Mon, 6 Nov 2017 19:15:36 +0000 (14:15 -0500)]
bgpd: re-add write trigger logic
Apparently I didn't fully understand how subgroup packets make their way
out to individual peers. Turns out (on the base branch) we just busy
poll while waiting for packets to make their way onto subgroup queues.
While this needs to be fixed in the future, for now readding this logic
fixes performance issues with convergence.
Quentin Young [Mon, 6 Nov 2017 06:41:27 +0000 (01:41 -0500)]
bgpd: properly set peer->last_update
Instead of checking whether the post-write number of updates sent was
greater than the pre-write number of updates sent, it was comparing post
to zero. In effect this meant every time we wrote a packet it was
counted as an update for route advertisement timer purposes.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
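A minimal sketch of the corrected comparison (simplified types): compare
the post-write UPDATE counter against its pre-write value rather than
against zero.

    #include <stdint.h>
    #include <time.h>

    struct peer_sketch {
        uint64_t update_out; /* UPDATE messages sent so far */
        time_t last_update;  /* consulted by route advertisement timers */
    };

    static void after_write(struct peer_sketch *p, uint64_t updates_before)
    {
        /* Only bump last_update if this write actually sent UPDATEs;
         * comparing against zero made every written packet count. */
        if (p->update_out > updates_before)
            p->last_update = time(NULL);
    }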
Quentin Young [Mon, 6 Nov 2017 05:33:46 +0000 (00:33 -0500)]
bgpd: schedule packet job after connection xfer
During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input and a
reorganized packet processor, the following race condition manifests:
1. Remote peer initiates a connection. After exchanging OPEN messages,
we send them a KEEPALIVE. They send us a KEEPALIVE followed by
10,000 UPDATE messages. The I/O thread pushes these onto our local
peer's input buffer and schedules a packet processing job on the
main thread.
2. The packet job runs and processes the KEEPALIVE, which completes the
handshake on our end. As part of transferring to ESTABLISHED we
transfer all peer state to a new struct, as mentioned. Upon returning
from the KEEPALIVE processing routine, the peer context we had has
now been destroyed. We notice this and stop processing. Meanwhile
10k UPDATE messages are sitting on the input buffer.
3. N seconds later, the remote peer sends us a KEEPALIVE. The I/O thread
schedules another process job, which finds 10k UPDATEs waiting for
it. Convergence is achieved, but has been delayed by the value of the
KEEPALIVE timer.
The racey part is that if the remote peer takes a little bit of time to
send UPDATEs after KEEPALIVEs -- somewhere on the order of a few hundred
milliseconds -- we complete the transfer successfully and the packet
processing job is scheduled on the new peer upon arrival of the UPDATE
messages. Yuck.
The solution is to schedule a packet processing job on the new peer
struct after transferring state.
Lengthy commit message in case someone has to debug similar problems in
the future...
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Fri, 3 Nov 2017 18:47:56 +0000 (14:47 -0400)]
bgpd: transfer raw input buffer to new peer
During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input, I forgot
to transfer the raw input buffer to the new peer. This resulted in
infrequent failures during session handshaking whereby half of a packet
would be thrown away in the middle of a read causing us to send a NOTIFY
for an unsynchronized header. Usually the transfer coincided with a
clean input buffer, hence why it only showed up once in a while.
Quentin Young [Mon, 25 Sep 2017 02:18:15 +0000 (22:18 -0400)]
bgpd: fix bgp active open
At some point when rearranging FSM code, bgpd lost the ability to
perform active opens because it was only paying attention to POLLIN and
not POLLOUT, when the latter is used to signify a successful connection
in the active case.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
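This part is standard POSIX behavior rather than bgpd-specific code: a
non-blocking connect() completes asynchronously and signals success by
making the socket writable, so the active side must poll for POLLOUT and
then check SO_ERROR. A self-contained sketch:

    #include <poll.h>
    #include <sys/socket.h>

    /* Wait for a non-blocking connect() on 'fd' to finish; returns 0 once
     * the connection is established, -1 on timeout or error. */
    static int wait_active_open(int fd, int timeout_ms)
    {
        struct pollfd pfd = { .fd = fd, .events = POLLIN | POLLOUT };
        int err = 0;
        socklen_t len = sizeof(err);

        if (poll(&pfd, 1, timeout_ms) <= 0)
            return -1;
        if (pfd.revents & POLLOUT) {
            /* Writable: now confirm the connect actually succeeded. */
            if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err)
                return -1;
            return 0;
        }
        return -1;
    }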
Quentin Young [Mon, 12 Jun 2017 21:16:40 +0000 (21:16 +0000)]
bgpd: be more promiscuous with updgrp packets
Slightly incorrect trigger for generating update group packets. In order
to match semantics of previous bgp_write() we need to trigger
update-group packet generation after every write operation, even if no
packets were written. Of course if we're tearing down the session we can
still skip this operation.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Mon, 12 Jun 2017 20:20:50 +0000 (20:20 +0000)]
bgpd: re-add update-group write triggers
Removed in earlier version where the I/O pthread busy-waited for packets
to be posted to an output queue. Now that it's poll()-based, it's
necessary once again. Although this time we can say what we're actually
doing, instead of it being a side effect of a write job.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Mon, 12 Jun 2017 02:53:42 +0000 (02:53 +0000)]
bgpd: misc fsm fixes
* Keepalive on/off calls are necessary in certain cases due to screwy
fsm flow not turning them on after transferring a passive peer
connection in peer_xfer_conn
* Missed a case in bgp_event_update() that resulted in a return code of -1
instead of BGP_Stop, which confuses the packet processing routine
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
Quentin Young [Sat, 10 Jun 2017 01:01:56 +0000 (01:01 +0000)]
bgpd: fix bgp_packet.c / bgp_fsm.c organization
Despaghettification of bgp_packet.c and bgp_fsm.c
Sometimes we call bgp_event_update() inline during packet parsing.
Sometimes we post events instead.
Sometimes we increment packet counters in the FSM.
Sometimes we do it in packet routines.
Sometimes we update EORs in the FSM.
Sometimes we do it in packet routines.
Fix the madness.
bgp_process_packet() is now the centralized place to:
- Update message counters
- Execute FSM events in response to incoming packets
FSM events are now executed directly from this function instead of being
queued on the thread_master. This is to ensure that the FSM contains the
proper state after each packet is parsed. Otherwise there could be race
conditions where two packets are parsed in succession without the
appropriate FSM update in between, leading to session closure due to
receiving inappropriate messages for the current FSM state.
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
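A sketch of the centralized shape described above (types and names are
illustrative): one routine bumps the message counters and runs the FSM
event inline, so the FSM state is current before the next packet is parsed.

    enum pkt_type { PKT_OPEN, PKT_UPDATE, PKT_KEEPALIVE, PKT_NOTIFY };

    struct peer_sketch {
        unsigned long open_in, update_in, keepalive_in, notify_in;
    };

    /* Runs the FSM event synchronously and returns its result. */
    int fsm_event_run(struct peer_sketch *p, enum pkt_type type);

    static int process_packet(struct peer_sketch *p, enum pkt_type type)
    {
        /* Message counters are updated here, in one place. */
        switch (type) {
        case PKT_OPEN:      p->open_in++;      break;
        case PKT_UPDATE:    p->update_in++;    break;
        case PKT_KEEPALIVE: p->keepalive_in++; break;
        case PKT_NOTIFY:    p->notify_in++;    break;
        }
        /* Execute the FSM event directly instead of queueing it, so two
         * back-to-back packets cannot race past a missing state change. */
        return fsm_event_run(p, type);
    }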