git.puffer.fish Git - matthieu/frr.git/log
matthieu/frr.git
7 years agozebra: ipv6 addressing uses netlink socket instead of standard ioctl
Philippe Guibert [Mon, 11 Dec 2017 14:21:04 +0000 (15:21 +0100)]
zebra: ipv6 addressing uses netlink socket instead of standard ioctl

It is possible to configure IPv6 addresses on interfaces by using a
netlink socket, instead of using standard sockets.

Signed-off-by: Philippe Guibert <philippe.guibert@6wind.com>
7 years agoMerge pull request #1478 from bingen/zeromq4
Donald Sharp [Wed, 13 Dec 2017 12:36:57 +0000 (07:36 -0500)]
Merge pull request #1478 from bingen/zeromq4

lib: Address ZMQ lib TODOs

7 years agoMerge pull request #1540 from opensourcerouting/isis-spfperf1
Donald Sharp [Tue, 12 Dec 2017 17:41:07 +0000 (12:41 -0500)]
Merge pull request #1540 from opensourcerouting/isis-spfperf1

isisd: save a clock_gettime() call

7 years agoMerge pull request #1539 from LabNConsulting/working/master/community-decisions
Donald Sharp [Tue, 12 Dec 2017 16:59:07 +0000 (11:59 -0500)]
Merge pull request #1539 from LabNConsulting/working/master/community-decisions

COMMUNITY.md: add paragraph on use of development list and discussing…

7 years agoMerge pull request #1514 from donaldsharp/watchfrr
Martin Winter [Tue, 12 Dec 2017 16:51:25 +0000 (08:51 -0800)]
Merge pull request #1514 from donaldsharp/watchfrr

tools, watchfrr: Modify timeout to 90 seconds

7 years agoisisd: save a clock_gettime() call
Rafael Zalamena [Tue, 12 Dec 2017 13:47:04 +0000 (11:47 -0200)]
isisd: save a clock_gettime() call

Use the thread's cached clock as the start time. This saves a call to
clock_gettime() and also provides a more 'accurate' time measurement from
the start of the procedure.

7 years agoCOMMUNITY.md: add paragraph on use of development list and discussing/documenting...
Lou Berger [Tue, 12 Dec 2017 13:42:54 +0000 (08:42 -0500)]
COMMUNITY.md: add paragraph on use of development list and discussing/documenting decisions

7 years agoMerge pull request #1526 from chiragshah6/ospfv3_dev
Jafar Al-Gharaibeh [Mon, 11 Dec 2017 17:51:43 +0000 (11:51 -0600)]
Merge pull request #1526 from chiragshah6/ospfv3_dev

ospf6d: Fix multi nexthop route remove

7 years agoMerge pull request #1524 from dslicenc/zebra-ra-display-cm18702
Renato Westphal [Mon, 11 Dec 2017 17:37:26 +0000 (15:37 -0200)]
Merge pull request #1524 from dslicenc/zebra-ra-display-cm18702

zebra: do not display ipv6 ra commands created by bgpd

7 years agoMerge pull request #1531 from chiragshah6/ospf_vrf_dev
Renato Westphal [Mon, 11 Dec 2017 12:55:03 +0000 (10:55 -0200)]
Merge pull request #1531 from chiragshah6/ospf_vrf_dev

ospfd: prevent passive interface cmd crash

7 years agoMerge pull request #1528 from donaldsharp/ldp_macro
Renato Westphal [Mon, 11 Dec 2017 00:38:02 +0000 (22:38 -0200)]
Merge pull request #1528 from donaldsharp/ldp_macro

ldpd: Switch over to new debug style

7 years agoospfd: prevent passive interface cmd crash
Chirag Shah [Fri, 8 Dec 2017 17:33:53 +0000 (09:33 -0800)]
ospfd: prevent passive interface cmd crash

The current OSPF VRF configuration allows pre-provisioning even if the
VRF is not configured. In that case ospf->vrf_id would be VRF_UNKNOWN.
When a passive interface is configured under such an ospf instance,
it would look up all vrf_device entries and try to create an ifp with an
unknown vrf_id.

For the passive interface config command, look up the ifp only if vrf_id is within range.

Ticket:CM-19156
Testing Done:
Configure
Cumulus#: router ospf vrf vrf1
Cumulus(config-router)#: passive interface swp16
 interface swp16 not found.

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
7 years agoldpd: Switch over to new debug style
Donald Sharp [Thu, 7 Dec 2017 23:59:54 +0000 (18:59 -0500)]
ldpd: Switch over to new debug style

When compiling ldpd on a mac, there exists a #define MSG_SEND
which conflicts with a define in ldp_debug.h.

During discussion about this we decided that it would be
better to remove the macro massaging that was going on and
to just call our own #define for it.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agoMerge pull request #1519 from donaldsharp/ptm
Rafael Zalamena [Thu, 7 Dec 2017 14:37:10 +0000 (12:37 -0200)]
Merge pull request #1519 from donaldsharp/ptm

Ptm

7 years agoMerge pull request #1520 from donaldsharp/vrf_leaking
Rafael Zalamena [Thu, 7 Dec 2017 13:43:38 +0000 (11:43 -0200)]
Merge pull request #1520 from donaldsharp/vrf_leaking

Cleanup a bunch of code

7 years agoospf6d: Fix multi nexthop route remove
Chirag Shah [Fri, 1 Dec 2017 01:45:12 +0000 (17:45 -0800)]
ospf6d: Fix multi nexthop route remove

Fix sorting of route storage to DB.
Fix the comparison of two lists, which allows a route with
multiple nexthops to be updated.

Ticket:CM-19025
Testing Done:
Configured a topology where ospf6 learns an
ECMP route via one neighbor and, upon removal
of the intra-prefix route from the origin, the DUT removes
the ECMP intra-prefix route from the RIB.

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
7 years agozebra: do not display ipv6 ra commands created by bgpd
Don Slice [Wed, 6 Dec 2017 17:00:48 +0000 (09:00 -0800)]
zebra: do not display ipv6 ra commands created by bgpd

If the frr.conf file contains bgp unnumbered peering but the associated
interfaces do not have the commands "no ipv6 nd suppress-ra" and
"ipv6 nd ra-interval 10" configured, when frr-reload.py is issued the
interface commands are removed from the running config, causing peers to
go down and stay down after a link flap.  This situation can occur if
the frr.conf file is created manually or via automation (like ansible)
but a subsequent "wr mem" has not been performed.

This fix changes the behavior so that the interface ipv6 nd ra commands
created by bgp are not displayed.  Therefore, when the above condition
occurs, there is no difference between the running and stored configs
and peers work fine.

Ticket: CM-18702
Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed-by: CCR-7004
Testing-done:  Manual testing successful.  L3-smoke has no new failures

7 years agoMerge pull request #1502 from chiragshah6/ospf_vrf_dev
Renato Westphal [Tue, 5 Dec 2017 16:47:53 +0000 (14:47 -0200)]
Merge pull request #1502 from chiragshah6/ospf_vrf_dev

ospfd: Display all vrf aware ospf interface config

7 years agoMerge pull request #1494 from opensourcerouting/u1710-master
Donald Sharp [Tue, 5 Dec 2017 12:03:15 +0000 (07:03 -0500)]
Merge pull request #1494 from opensourcerouting/u1710-master

Fixes to allow Package build on Ubuntu 17.10

7 years agodebianpkg: Update Pkg build instructions with Ubuntu 17.10 and fix errors
Martin Winter [Thu, 30 Nov 2017 03:23:20 +0000 (19:23 -0800)]
debianpkg: Update Pkg build instructions with Ubuntu 17.10 and fix errors

- plus add pointer for creating new backport
- plus add example for customizing package with WANT_* options

Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
7 years agodebianpkg: Add Debian backport for Ubuntu 17.10
Martin Winter [Wed, 29 Nov 2017 09:17:28 +0000 (01:17 -0800)]
debianpkg: Add Debian backport for Ubuntu 17.10

Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
7 years agodebianpkg: Fix lintian warning "command-with-path-in-maintainer-script"
Martin Winter [Wed, 29 Nov 2017 09:05:46 +0000 (01:05 -0800)]
debianpkg: Fix lintian warning "command-with-path-in-maintainer-script"

Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
7 years agolib: Fix gcc 7 warning 'error: ‘fld’ may be used uninitialized in this function'
Martin Winter [Tue, 5 Dec 2017 08:26:41 +0000 (00:26 -0800)]
lib: Fix gcc 7 warning 'error: ‘fld’ may be used uninitialized in this function'

Warning breaks Debian Package build with gcc 7 which uses -Werror=maybe-uninitialized

Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
7 years agobgpd: bgp_attr.c GCC 7.0 with --werror needs explicit fall-thru comment
Martin Winter [Wed, 29 Nov 2017 09:29:04 +0000 (01:29 -0800)]
bgpd: bgp_attr.c GCC 7.0 with --werror needs explicit fall-thru comment

Signed-off-by: Martin Winter <mwinter@opensourcerouting.org>
7 years agobgpd: Cleanup unneeded NULL checks.
Donald Sharp [Tue, 5 Dec 2017 02:26:57 +0000 (21:26 -0500)]
bgpd: Cleanup unneeded NULL checks.

All the NULL checks come after previous dereferences.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agowatchfrr: Fail gracefully if fopen fails
Donald Sharp [Tue, 5 Dec 2017 02:26:05 +0000 (21:26 -0500)]
watchfrr: Fail gracefully if fopen fails

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agobgpd, zebra: Use sscanf return value
Donald Sharp [Tue, 5 Dec 2017 01:51:34 +0000 (20:51 -0500)]
bgpd, zebra: Use sscanf return value

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agobgpd: Tell the compiler we don't care about the return code for peer_sort
Donald Sharp [Tue, 5 Dec 2017 01:43:46 +0000 (20:43 -0500)]
bgpd: Tell the compiler we don't care about the return code for peer_sort

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agobgpd: Reorder assignment and assertion.
Donald Sharp [Tue, 5 Dec 2017 01:33:44 +0000 (20:33 -0500)]
bgpd: Reorder assignment and assertion.

If we ever turn off assertion for production builds
this code as written will cause a crash in that
the assignment will not happen.

Modify the code such that this erroneous assumption
cannot happen.
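
The hazard described above can be sketched in Python rather than bgpd's C, as a loose analogy: C drops assert() when NDEBUG is defined, and Python drops assert statements under `python -O`. The function names and the dictionary lookup are hypothetical and are not taken from the bgpd source.

```python
def fragile_lookup(table, key):
    # Anti-pattern: the assignment only happens inside the assert.
    # With assertions disabled (python -O) the statement is dropped,
    # so `peer` is never assigned before it is returned.
    assert (peer := table.get(key)) is not None
    return peer


def robust_lookup(table, key):
    # Assign first, then assert on the result. The assignment survives
    # even when assertions are compiled out.
    peer = table.get(key)
    assert peer is not None, "peer must exist"
    return peer
```

Both functions behave the same while assertions are enabled; only the second keeps working when they are turned off.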

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agobabeld: if_eui64 never uses ifname
Donald Sharp [Tue, 5 Dec 2017 00:29:42 +0000 (19:29 -0500)]
babeld: if_eui64 never uses ifname

Remove this variable.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agoospfd: fix crash no router ospf/show running
Chirag Shah [Mon, 4 Dec 2017 22:08:23 +0000 (14:08 -0800)]
ospfd: fix crash no router ospf/show running

'no router ospf' removes the default ospf instance, while
other non-default vrf instances with interface level
configuration may still be present. Look up the ospf instance
for ifp->vrf_id; if an ospf instance is present, use that
to access the 'instance id'.

Ticket: CM-19078
Testing Done:
run no router ospf and show running config along with other
non-default vrf aware ospf configurations.

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
7 years agozebra: Cleanup leaked context information on failure
Donald Sharp [Tue, 5 Dec 2017 00:03:51 +0000 (19:03 -0500)]
zebra: Cleanup leaked context information on failure

When we get a STREAM_GET failure of some sort we
need to handle the failure case here and safely
free up stored memory/context and return gracefully.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agolib: Allow memory to be cleaned up for error cases in ptm
Donald Sharp [Mon, 4 Dec 2017 23:59:47 +0000 (18:59 -0500)]
lib: Allow memory to be cleaned up for error cases in ptm

ptm_lib.c had no way to cleanup after itself when an
error was detected.  This adds a function to cleanup
context in such a case.

A followup commit will use this new functionality.

Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agoospfd: Display all vrf aware interface config
Chirag Shah [Mon, 27 Nov 2017 21:59:19 +0000 (13:59 -0800)]
ospfd: Display all vrf aware interface config

OSPF interface specific configuration can be done independent
of router ospf [vrf x] global config.
In cases where ospf interface non default vrf configuration
is done prior to 'router ospf vrf x', show running-config
would not display such configuration.

To display the configuration, now walk all vrfs and their interface lists
and only display interfaces where OSPF config params are set.

Ticket:CM-18952
Testing Done:
Tried ospf interface specific configuration with VRF,
where router ospf vrf x is not present.

Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
7 years agoMerge pull request #1518 from opensourcerouting/clippy-issues
Donald Sharp [Mon, 4 Dec 2017 22:45:05 +0000 (17:45 -0500)]
Merge pull request #1518 from opensourcerouting/clippy-issues

*: make clippy usage more consistent

7 years ago*: make clippy usage more consistent
Renato Westphal [Mon, 4 Dec 2017 21:32:20 +0000 (19:32 -0200)]
*: make clippy usage more consistent

Fixes #1511.

Signed-off-by: Renato Westphal <renato@opensourcerouting.org>
7 years agoMerge pull request #1496 from donaldsharp/install_failure
Renato Westphal [Mon, 4 Dec 2017 20:25:16 +0000 (18:25 -0200)]
Merge pull request #1496 from donaldsharp/install_failure

Additional Southbound API changes

7 years agoMerge pull request #1507 from donaldsharp/bgp_af_open
Renato Westphal [Mon, 4 Dec 2017 19:34:19 +0000 (17:34 -0200)]
Merge pull request #1507 from donaldsharp/bgp_af_open

bgpd: Allow Address-Family activation to work in certain states

7 years agotools, watchfrr: Modify timeout to 90 seconds
Brian Rak [Wed, 15 Nov 2017 19:51:37 +0000 (14:51 -0500)]
tools, watchfrr: Modify timeout to 90 seconds

The default timeout of 10 seconds is too quick of a timeout
given some long running cli commands.  Modify watchfrr
to have a 90s timeout value instead.

Signed-off-by: Brian Rak <brianrak@gameservers.com>
7 years agoMerge pull request #1500 from opensourcerouting/ldpd-fixes
Donald Sharp [Mon, 4 Dec 2017 14:06:09 +0000 (09:06 -0500)]
Merge pull request #1500 from opensourcerouting/ldpd-fixes

ldpd: small improvements

7 years agoMerge pull request #1508 from qlyoung/bgpd-fix-lock
Rafael Zalamena [Mon, 4 Dec 2017 13:16:45 +0000 (11:16 -0200)]
Merge pull request #1508 from qlyoung/bgpd-fix-lock

bgpd: fix potential deadlock

7 years agoMerge pull request #1472 from opensourcerouting/lintian-warning
Donald Sharp [Mon, 4 Dec 2017 13:02:16 +0000 (08:02 -0500)]
Merge pull request #1472 from opensourcerouting/lintian-warning

debianpkg: Suppress frr-dbg debug-file-with-no-debug-symbols warning

7 years agoMerge pull request #1510 from qlyoung/ospf-gitignore-clippy
Lou Berger [Fri, 1 Dec 2017 22:06:37 +0000 (06:06 +0800)]
Merge pull request #1510 from qlyoung/ospf-gitignore-clippy

ospfd: remove clippy file, fix .gitignore

7 years agoMerge pull request #1433 from qlyoung/remove-deprecated-stream-macros
Rafael Zalamena [Fri, 1 Dec 2017 19:46:02 +0000 (17:46 -0200)]
Merge pull request #1433 from qlyoung/remove-deprecated-stream-macros

*: don't use deprecated stream.h macros

7 years agoospfd: remove clippy file, fix .gitignore
Quentin Young [Fri, 1 Dec 2017 19:24:30 +0000 (14:24 -0500)]
ospfd: remove clippy file, fix .gitignore

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years ago*: don't use deprecated stream.h macros
Quentin Young [Wed, 8 Nov 2017 17:51:16 +0000 (12:51 -0500)]
*: don't use deprecated stream.h macros

Some of the deprecated stream.h macros see such little use that we may
as well just remove them and use the non-deprecated macros.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: fix potential deadlock
Quentin Young [Fri, 1 Dec 2017 18:41:27 +0000 (13:41 -0500)]
bgpd: fix potential deadlock

With the way things are set up, this bit of code would never actually
cause a deadlock, but one would be highly likely in the future.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: Allow Address-Family activation to work in certain states
Donald Sharp [Fri, 1 Dec 2017 16:49:13 +0000 (11:49 -0500)]
bgpd: Allow Address-Family activation to work in certain states

If we are in OpenSent or OpenConfirm peer state and we receive a new
address-family activation, we would end up ignoring the new activation
and not tell our peer about it.  This was visible when a
'show bgp neighbor' command returned 'Not in any update group'
for a particular family.

This modifies the code such that we now notice that we are in
either OpenSent or OpenConfirm state and reset the peer to
allow us to send them the new capability.

Ticket: CM-19021
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
7 years agoospfd: fix NSSA LSA translation (BZ#493) (BZ#250)
Svata Dedic [Thu, 22 Dec 2011 14:07:15 +0000 (18:07 +0400)]
ospfd: fix NSSA LSA translation (BZ#493) (BZ#250)

7 years agoMerge pull request #1145 from qlyoung/bgpd-pthreads-frr
Martin Winter [Fri, 1 Dec 2017 07:35:51 +0000 (23:35 -0800)]
Merge pull request #1145 from qlyoung/bgpd-pthreads-frr

Multithreaded BGPD

7 years agobgpd: small optimization with UPDATE generation
Quentin Young [Thu, 30 Nov 2017 22:16:37 +0000 (17:16 -0500)]
bgpd: small optimization with UPDATE generation

After a batch of generated UPDATEs, call bgp_writes_on() once instead of
after generating each packet.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: use FOREACH_AFI_SAFI()
Quentin Young [Thu, 30 Nov 2017 21:58:37 +0000 (16:58 -0500)]
bgpd: use FOREACH_AFI_SAFI()

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: intelligently adjust coalesce timer
Quentin Young [Thu, 30 Nov 2017 19:11:12 +0000 (14:11 -0500)]
bgpd: intelligently adjust coalesce timer

The subgroup coalesce timer controls how long updates to a particular
subgroup are delayed in order to allow additional peers to join the
subgroup. Presently the timer value is 200 ms. Increase it to 1 second
and adjust up as peers are configured, with an upper cap at 10s.

This cuts convergence time by a factor of 3 at large scale (300+ peers,
1000+ prefixes per peer).
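
A minimal sketch of the timer adjustment, in Python rather than bgpd's C. The 1 second base and the 10 s cap come from the message above; the linear growth and the per-peer increment of 20 ms are assumptions made only for illustration, since the message does not say how the value scales.

```python
BASE_MS = 1000      # new starting value, from the commit message
CAP_MS = 10_000     # upper cap, from the commit message
PER_PEER_MS = 20    # hypothetical increment; not given in the message


def coalesce_time_ms(peer_count, per_peer_ms=PER_PEER_MS):
    """Return a coalesce delay that grows with peer count up to the cap."""
    if peer_count < 0:
        raise ValueError("peer_count must be non-negative")
    return min(BASE_MS + per_peer_ms * peer_count, CAP_MS)
```

With these assumed numbers, 100 configured peers give a 3 second delay and the cap is reached at 450 peers.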

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agotests: neuter fuzzing frontend for now
Quentin Young [Thu, 30 Nov 2017 20:07:29 +0000 (15:07 -0500)]
tests: neuter fuzzing frontend for now

Fuzzing hook for BGP packet processing does not map to MT-BGPD. Removing
offending call for now, additional work to fix this in the future.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: turn off keepalives when sending NOTIFY
Quentin Young [Mon, 13 Nov 2017 22:59:04 +0000 (17:59 -0500)]
bgpd: turn off keepalives when sending NOTIFY

This is necessary because otherwise between the time we wipe the output
buffer and the time we push the NOTIFY onto it, the KA generation thread
could have pushed a KEEPALIVE in the middle.
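
The ordering can be sketched as follows, in Python rather than bgpd's C. The Peer class, the `keepalives_on` flag and the `between` hook are hypothetical; the hook only stands in for the keepalive thread running in the gap between wiping the buffer and queuing the NOTIFY.

```python
import threading
from collections import deque


class Peer:
    def __init__(self):
        self.lock = threading.Lock()
        self.obuf = deque()          # output buffer
        self.keepalives_on = True


def keepalive_tick(peer):
    # Called by the keepalive generator; does nothing once disabled.
    with peer.lock:
        if peer.keepalives_on:
            peer.obuf.append("KEEPALIVE")


def send_notify(peer, between=lambda: None):
    with peer.lock:
        peer.keepalives_on = False   # turn keepalives off first
        peer.obuf.clear()            # then wipe the output buffer
    between()                        # a tick firing here is now harmless
    with peer.lock:
        peer.obuf.append("NOTIFY")
```

Because the flag is cleared before the buffer is wiped, a tick that fires in the gap cannot slip a KEEPALIVE ahead of the NOTIFY.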

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: yield more when generating UPDATEs
Quentin Young [Mon, 13 Nov 2017 08:18:49 +0000 (03:18 -0500)]
bgpd: yield more when generating UPDATEs

In the same vein as the round-robin input commit, this re-adds logic for
limiting the amount of time spent generating UPDATEs per generation
cycle. Missed this when shifting around wpkt_quanta; prior to MT it
limited both calls to write() as well as UPDATE generation.

7 years agobgpd: schedule UPDATE generation smarter
Quentin Young [Fri, 10 Nov 2017 22:03:58 +0000 (17:03 -0500)]
bgpd: schedule UPDATE generation smarter

No need to schedule a job to generate more packets until we're done with
the ones we've got. Shaves a few percent off convergence time.

7 years agobgpd: restore packet input limit
Quentin Young [Fri, 10 Nov 2017 21:42:49 +0000 (16:42 -0500)]
bgpd: restore packet input limit

Unfortunately, batching input processing severely impacts BGP initial
convergence times. As a consequence of the way update-groups were
implemented, advancing the state of the routing table based on prefixes
learned from one peer prior to all (or at least most) peers establishing
connections will cause us to start generating outbound UPDATEs, which is
a very expensive operation at present. This intensive processing starves
out bgp_accept(), delaying connection of additional peers. When
additional peers do connect the problem gets worse and worse, yielding
approximately exponential growth in convergence time dependent on both
peering and prefix counts. This behavior is present pre-multithreading
as well, but batched input exacerbates it.

Round-robin input processing marginally harms convergence times for
small topologies but should allow much larger topologies to function
within reasonable performance thresholds.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: schedule process packet as timer
Quentin Young [Tue, 7 Nov 2017 07:49:54 +0000 (02:49 -0500)]
bgpd: schedule process packet as timer

Different places scheduling the same thread should use the same
semantics and thread type. Additionally providing the back reference
here makes sure we only schedule the job once and avoids flooding the
event queue with jobs to process an empty buffer.
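
A rough sketch of the "schedule once" idea, in Python rather than the FRR thread library. The `t_process` field and both helper functions are hypothetical names; the point is only that a back reference stored on the peer lets the scheduler see that a job is already pending.

```python
class Peer:
    def __init__(self):
        self.t_process = None    # back reference to the pending job, if any


def schedule_process(queue, peer):
    """Queue a processing job unless one is already pending for this peer."""
    if peer.t_process is not None:
        return False             # already scheduled; do not flood the queue
    job = ("process", peer)
    peer.t_process = job
    queue.append(job)
    return True


def run_next(queue):
    """Pop the next job and clear the back reference so it can be rescheduled."""
    kind, peer = queue.pop(0)
    peer.t_process = None
    return kind
```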

7 years agobgpd: re-add write trigger logic
Quentin Young [Mon, 6 Nov 2017 19:15:36 +0000 (14:15 -0500)]
bgpd: re-add write trigger logic

Apparently I didn't fully understand how subgroup packets make their way
out to individual peers. Turns out (on the base branch) we just busy
poll while waiting for packets to make their way onto subgroup queues.
While this needs to be fixed in the future, for now re-adding this logic
fixes performance issues with convergence.

7 years agobgpd: properly set peer->last_update
Quentin Young [Mon, 6 Nov 2017 06:41:27 +0000 (01:41 -0500)]
bgpd: properly set peer->last_update

Instead of checking whether the post-write number of updates sent was
greater than the pre-write number of updates sent, it was comparing post
to zero. In effect this meant every time we wrote a packet it was
counted as an update for route advertisement timer purposes.
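
The difference between the two comparisons, sketched in Python with hypothetical counter names (the message does not name the fields involved):

```python
def wrote_update_buggy(updates_before, updates_after):
    # Old behavior as described: compares the post-write count to zero,
    # so any write looks like an update once one has ever been sent.
    return updates_after > 0


def wrote_update_fixed(updates_before, updates_after):
    # Intended behavior: an update was written only if the count grew.
    return updates_after > updates_before
```

For a write that sends only a KEEPALIVE after three earlier UPDATEs, the counts are (3, 3): the old check reports an update, the fixed one does not.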

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: schedule packet job after connection xfer
Quentin Young [Mon, 6 Nov 2017 05:33:46 +0000 (00:33 -0500)]
bgpd: schedule packet job after connection xfer

During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input and a
reorganized packet processor, the following race condition manifests:

1. Remote peer initiates a connection. After exchanging OPEN messages,
   we send them a KEEPALIVE. They send us a KEEPALIVE followed by
   10,000 UPDATE messages. The I/O thread pushes these onto our local
   peer's input buffer and schedules a packet processing job on the
   main thread.
2. The packet job runs and processes the KEEPALIVE, which completes the
   handshake on our end. As part of transferring to ESTABLISHED we
   transfer all peer state to a new struct, as mentioned. Upon returning
   from the KEEPALIVE processing routine, the peer context we had has
   now been destroyed. We notice this and stop processing. Meanwhile
   10k UPDATE messages are sitting on the input buffer.
3. N seconds later, the remote peer sends us a KEEPALIVE. The I/O thread
   schedules another process job, which finds 10k UPDATEs waiting for
   it. Convergence is achieved, but has been delayed by the value of the
   KEEPALIVE timer.

The racy part is that if the remote peer takes a little bit of time to
send UPDATEs after KEEPALIVEs -- somewhere on the order of a few hundred
milliseconds -- we complete the transfer successfully and the packet
processing job is scheduled on the new peer upon arrival of the UPDATE
messages. Yuck.

The solution is to schedule a packet processing job on the new peer
struct after transferring state.
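
The sequence above can be modeled with a toy single-threaded job queue, in Python rather than bgpd's C. Everything here (the Peer class, the `simulate` function and its flag) is hypothetical and only mirrors the steps in the message: a KEEPALIVE completes the handshake, state moves to a new peer, and the old job stops.

```python
from collections import deque


class Peer:
    def __init__(self, established):
        self.established = established
        self.ibuf = deque()      # input buffer filled by the I/O thread
        self.processed = []


def simulate(reschedule_after_transfer, updates=5):
    """Return (updates processed, updates still waiting) on the new peer."""
    jobs = deque()
    old = Peer(established=False)
    old.ibuf.extend(["KEEPALIVE"] + ["UPDATE"] * updates)
    jobs.append(old)
    current = old
    while jobs:
        peer = jobs.popleft()
        while peer.ibuf:
            msg = peer.ibuf.popleft()
            if msg == "KEEPALIVE" and not peer.established:
                # Handshake done: move state, including pending input,
                # to a new peer. The old context is gone, so stop here.
                new = Peer(established=True)
                new.ibuf, peer.ibuf = peer.ibuf, deque()
                if reschedule_after_transfer:
                    jobs.append(new)     # the fix: schedule on the new peer
                current = new
                break
            peer.processed.append(msg)
    return len(current.processed), len(current.ibuf)
```

Without the extra scheduling call the UPDATEs sit in the buffer until some later event wakes the processor; with it they are handled right after the transfer.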

Lengthy commit message in case someone has to debug similar problems in
the future...

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: transfer raw input buffer to new peer
Quentin Young [Fri, 3 Nov 2017 18:47:56 +0000 (14:47 -0400)]
bgpd: transfer raw input buffer to new peer

During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input, I forgot
to transfer the raw input buffer to the new peer. This resulted in
infrequent failures during session handshaking whereby half of a packet
would be thrown away in the middle of a read causing us to send a NOTIFY
for an unsynchronized header. Usually the transfer coincided with a
clean input buffer, hence why it only showed up once in a while.

7 years agobgpd: fix bgp active open
Quentin Young [Mon, 25 Sep 2017 02:18:15 +0000 (22:18 -0400)]
bgpd: fix bgp active open

At some point when rearranging FSM code, bgpd lost the ability to
perform active opens because it was only paying attention to POLLIN and
not POLLOUT, when the latter is used to signify a successful connection
in the active case.
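
A minimal sketch of an active open on a non-blocking socket, using Python's standard `socket` and `select` modules rather than bgpd's C code. The function name and timeout are hypothetical. It relies on the usual POSIX behavior that a pending connect() reports completion as writability, after which SO_ERROR tells success from failure.

```python
import errno
import os
import select
import socket


def active_open(address, timeout_ms=2000):
    """Start a non-blocking connect and wait for POLLOUT to confirm it."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    rc = sock.connect_ex(address)
    if rc not in (0, errno.EINPROGRESS):
        sock.close()
        raise OSError(rc, os.strerror(rc))

    poller = select.poll()
    # Watching POLLIN alone would miss the completion of the connect.
    poller.register(sock, select.POLLOUT)
    if not poller.poll(timeout_ms):
        sock.close()
        raise TimeoutError("connect timed out")

    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock
```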

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: use correct byte order for notify data
Quentin Young [Wed, 20 Sep 2017 15:11:30 +0000 (11:11 -0400)]
bgpd: use correct byte order for notify data

Broke this when rewriting header validation.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agotests: add name to test_mp_attr threadmaster
Quentin Young [Fri, 8 Sep 2017 16:58:59 +0000 (12:58 -0400)]
tests: add name to test_mp_attr threadmaster

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd, tests: comment formatting
Quentin Young [Fri, 8 Sep 2017 15:51:12 +0000 (11:51 -0400)]
bgpd, tests: comment formatting

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: fix some formatting in bgp_io.c
Quentin Young [Fri, 4 Aug 2017 18:27:42 +0000 (14:27 -0400)]
bgpd: fix some formatting in bgp_io.c

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: update atomic memory orders
Quentin Young [Wed, 5 Jul 2017 15:38:57 +0000 (11:38 -0400)]
bgpd: update atomic memory orders

Use best-performing memory orders where appropriate.
Also update some style and add missing comments.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: rebase onto master
Quentin Young [Fri, 30 Jun 2017 18:04:32 +0000 (18:04 +0000)]
bgpd: rebase onto master

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: static bgp_pthreads_init()
Quentin Young [Mon, 26 Jun 2017 16:29:20 +0000 (16:29 +0000)]
bgpd: static bgp_pthreads_init()

got un-static'd at some point

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: fix uninitialized result code
Quentin Young [Mon, 26 Jun 2017 15:50:35 +0000 (15:50 +0000)]
bgpd: fix uninitialized result code

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: sleep in poll()
Quentin Young [Fri, 16 Jun 2017 20:15:31 +0000 (20:15 +0000)]
bgpd: sleep in poll()

poll won't sleep if there are no file descriptors! gotta sleep!

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: lift read-quanta restriction
Quentin Young [Tue, 13 Jun 2017 19:06:51 +0000 (19:06 +0000)]
bgpd: lift read-quanta restriction

Per previous work to ensure all FSM state is updated after processing
each message, read-quanta should be safe to set > 1.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: remove unused extern from bgp_io.h
Quentin Young [Tue, 13 Jun 2017 01:58:39 +0000 (01:58 +0000)]
bgpd: remove unused extern from bgp_io.h

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: be more promiscuous with updgrp packets
Quentin Young [Mon, 12 Jun 2017 21:16:40 +0000 (21:16 +0000)]
bgpd: be more promiscuous with updgrp packets

Slightly incorrect trigger for generating update group packets. In order
to match semantics of previous bgp_write() we need to trigger
update-group packet generation after every write operation, even if no
packets were written. Of course if we're tearing down the session we can
still skip this operation.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: re-add update-group write triggers
Quentin Young [Mon, 12 Jun 2017 20:20:50 +0000 (20:20 +0000)]
bgpd: re-add update-group write triggers

Removed in an earlier version where the I/O pthread busy-waited for packets
to be posted to an output queue. Now that it's poll()-based, it's
necessary once again. This time, though, the trigger is explicit
rather than a side effect of a write job.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agotests: update tests for bgp_packet changes
Quentin Young [Mon, 12 Jun 2017 17:35:47 +0000 (17:35 +0000)]
tests: update tests for bgp_packet changes

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: free notify packet after writing
Quentin Young [Mon, 12 Jun 2017 06:46:56 +0000 (06:46 +0000)]
bgpd: free notify packet after writing

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: misc fsm fixes
Quentin Young [Mon, 12 Jun 2017 02:53:42 +0000 (02:53 +0000)]
bgpd: misc fsm fixes

* Keepalive on/off calls are necessary in certain cases due to screwy
  fsm flow not turning them on after transferring a passive peer
  connection in peer_xfer_conn

* Missed a case in bgp_event_update() that resulted in a return code of -1
  instead of BGP_Stop, which confuses the packet processing routine

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: fix bgp_packet.c / bgp_fsm.c organization
Quentin Young [Sat, 10 Jun 2017 01:01:56 +0000 (01:01 +0000)]
bgpd: fix bgp_packet.c / bgp_fsm.c organization

Despaghettification of bgp_packet.c and bgp_fsm.c

Sometimes we call bgp_event_update() inline during packet parsing.
Sometimes we post events instead.
Sometimes we increment packet counters in the FSM.
Sometimes we do it in packet routines.
Sometimes we update EOR's in FSM.
Sometimes we do it in packet routines.

Fix the madness.

bgp_process_packet() is now the centralized place to:
- Update message counters
- Execute FSM events in response to incoming packets

FSM events are now executed directly from this function instead of being
queued on the thread_master. This is to ensure that the FSM contains the
proper state after each packet is parsed. Otherwise there could be race
conditions where two packets are parsed in succession without the
appropriate FSM update in between, leading to session closure due to
receiving inappropriate messages for the current FSM state.
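
The idea of running the FSM inline after each packet can be sketched with a toy state table, in Python rather than bgpd's C. The table covers only a few transitions of the BGP FSM and the function is a hypothetical stand-in, not bgp_process_packet() itself.

```python
# Simplified subset of BGP FSM transitions, keyed by (state, message).
TRANSITIONS = {
    ("OpenSent", "OPEN"): "OpenConfirm",
    ("OpenConfirm", "KEEPALIVE"): "Established",
    ("Established", "KEEPALIVE"): "Established",
    ("Established", "UPDATE"): "Established",
}


def process_packets(state, packets):
    """Count each message and advance the FSM before looking at the next."""
    counters = {}
    for msg in packets:
        counters[msg] = counters.get(msg, 0) + 1
        # Run the FSM event directly so the next packet sees the new state.
        state = TRANSITIONS.get((state, msg), "Idle")
        if state == "Idle":
            break                # inappropriate message: session closes
    return state, counters
```

If the KEEPALIVE's transition were queued instead of applied, a following UPDATE would be checked against OpenConfirm and the session would drop to Idle.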

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: fix includes for bgp_keepalives.c
Quentin Young [Fri, 9 Jun 2017 19:34:29 +0000 (19:34 +0000)]
bgpd: fix includes for bgp_keepalives.c

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: restyle bgp_keepalives.[ch]
Quentin Young [Fri, 9 Jun 2017 19:22:34 +0000 (19:22 +0000)]
bgpd: restyle bgp_keepalives.[ch]

And update copyright header.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: use stop event instead of pthread_kill()
Quentin Young [Fri, 9 Jun 2017 18:10:59 +0000 (18:10 +0000)]
bgpd: use stop event instead of pthread_kill()

When terminating I/O thread, just schedule an event to do any necessary
cleanup and gracefully exit instead of using a signal.
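
A small sketch of the stop-event pattern, using Python's `threading` and `queue` modules rather than pthreads. The `STOP` sentinel and `io_loop` function are hypothetical; the worker exits from inside its own loop, where it can clean up, instead of being interrupted by a signal.

```python
import queue
import threading

STOP = object()    # sentinel event asking the worker to shut down


def io_loop(events, log):
    """Handle events in order; on STOP, clean up and return."""
    while True:
        event = events.get()
        if event is STOP:
            log.append("cleanup")    # release resources here
            return
        log.append(event)
```

Events queued before the stop request are still handled, because the sentinel is processed in order like any other event.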

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: update I/O docs
Quentin Young [Thu, 8 Jun 2017 21:47:33 +0000 (21:47 +0000)]
bgpd: update I/O docs

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: restyle
Quentin Young [Thu, 8 Jun 2017 21:25:23 +0000 (21:25 +0000)]
bgpd: restyle

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: small i/o threading improvements
Quentin Young [Thu, 8 Jun 2017 21:14:18 +0000 (21:14 +0000)]
bgpd: small i/o threading improvements

* Start bit flags at 1, not 2
* Make run-flags atomic for i/o thread
* Remove work_cond mutex, it should no longer be necessary
* Add asserts to ensure proper ordering in bgp_connect()
* Use true/false with booleans, not 1/0

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: bye bye THREAD_BACKGROUND
Quentin Young [Thu, 8 Jun 2017 20:41:21 +0000 (20:41 +0000)]
bgpd: bye bye THREAD_BACKGROUND

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: use mt-safe thread_cancel()
Quentin Young [Wed, 7 Jun 2017 21:29:48 +0000 (21:29 +0000)]
bgpd: use mt-safe thread_cancel()

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: set thread_master owner appropriately
Quentin Young [Wed, 7 Jun 2017 21:09:59 +0000 (21:09 +0000)]
bgpd: set thread_master owner appropriately

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: atomize write-quanta, add read-quanta
Quentin Young [Mon, 5 Jun 2017 20:14:47 +0000 (20:14 +0000)]
bgpd: atomize write-quanta, add read-quanta

bgpd supports setting a write-quanta that serves as a hint on how many
packets to write per I/O cycle. Now that input is buffered, it makes
sense to add the equivalent parameter for how many packets are processed
per cycle. This is *not* how many packets are read off the wire per I/O
cycle; rather it is how many packets are processed from the input buffer
in a given cycle after having been read off the wire and sanitized.

Since these values must be used from multiple threads, they have also
been made atomic.
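
As a rough illustration of a read-quanta, the sketch below caps how many already-buffered packets one processing cycle consumes. The variable and function names, and the default of 10, are assumptions made for the example, not values taken from bgpd.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical tunable; 10 is an assumed default. */
static _Atomic uint32_t read_quanta = 10;

/* May be called from a configuration thread while others read the value. */
static void set_read_quanta(uint32_t n)
{
    atomic_store_explicit(&read_quanta, n, memory_order_relaxed);
}

/*
 * Process at most read_quanta packets from an input buffer that currently
 * holds *queued packets. Returns the number processed in this cycle.
 */
static uint32_t process_cycle(uint32_t *queued)
{
    uint32_t budget =
        atomic_load_explicit(&read_quanta, memory_order_relaxed);
    uint32_t done = 0;

    while (done < budget && *queued > 0) {
        (*queued)--; /* stand-in for handling one packet */
        done++;
    }
    return done;
}
```

Packets left over when the budget runs out simply stay in the buffer until the next cycle.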

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: batched i/o
Quentin Young [Fri, 2 Jun 2017 01:52:39 +0000 (01:52 +0000)]
bgpd: batched i/o

Instead of reading a packet header and the rest of the packet in two
separate i/o cycles, read a chunk of data at one time and then parse as
many packets as possible out of the chunk.
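
A sketch of parsing as many packets as possible out of one chunk. It relies only on the BGP header layout from RFC 4271 (16-byte marker, 2-byte big-endian length, 1-byte type; total length between 19 and 4096). The function name and the local constants are illustrative, not the bgpd implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes from RFC 4271, defined locally for this sketch. */
#define BGP_MARKER_SIZE     16
#define BGP_HEADER_SIZE     19
#define BGP_MAX_PACKET_SIZE 4096

/*
 * Count the complete packets at the front of buf[0..len).
 * On success, returns the count and stores the number of bytes they
 * occupy in *consumed; trailing partial data is left for the next read.
 * Returns -1 (leaving *consumed untouched) if a length field is invalid.
 */
static int count_complete_packets(const uint8_t *buf, size_t len,
                                  size_t *consumed)
{
    size_t off = 0;
    int npkts = 0;

    while (len - off >= BGP_HEADER_SIZE) {
        /* The length field follows the marker, in network byte order. */
        size_t pktlen = ((size_t)buf[off + BGP_MARKER_SIZE] << 8)
                        | buf[off + BGP_MARKER_SIZE + 1];

        if (pktlen < BGP_HEADER_SIZE || pktlen > BGP_MAX_PACKET_SIZE)
            return -1;
        if (len - off < pktlen)
            break; /* header is here, body has not fully arrived */

        off += pktlen;
        npkts++;
    }

    *consumed = off;
    return npkts;
}
```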

Also changes bgp_packet.c to batch process packets.

To avoid thrashing on useless mutex locks, the scheduling call for
bgp_process_packet has been changed to always succeed at the cost of no
longer being cancel-able. In this case this is acceptable; following the
pattern of other event-based callbacks, an additional check in
bgp_process_packet to ignore stray events is sufficient. Before deleting
the peer all events are cleared which provides the requisite ordering.

XXX: chunk hardcoded to 5, should use something similar to wpkt_quanta

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: fix includes for bgp_io.c
Quentin Young [Thu, 1 Jun 2017 16:44:02 +0000 (16:44 +0000)]
bgpd: fix includes for bgp_io.c

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: style for bgp i/o
Quentin Young [Thu, 1 Jun 2017 16:26:49 +0000 (16:26 +0000)]
bgpd: style for bgp i/o

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: use memcmp to check bgp marker
Quentin Young [Thu, 1 Jun 2017 16:20:58 +0000 (16:20 +0000)]
bgpd: use memcmp to check bgp marker

performance
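
For illustration, a marker check with memcmp could look like the sketch below. The all-ones 16-byte marker is defined by RFC 4271; the function name is hypothetical, and what the previous code looked like is not stated in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BGP_MARKER_SIZE 16 /* per RFC 4271 */

/* Returns true if the first 16 bytes of hdr are all 0xff. */
static bool marker_is_valid(const uint8_t *hdr)
{
    static const uint8_t all_ones[BGP_MARKER_SIZE] = {
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
    };

    return memcmp(hdr, all_ones, BGP_MARKER_SIZE) == 0;
}
```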

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: copyright style
Quentin Young [Wed, 17 May 2017 17:17:18 +0000 (17:17 +0000)]
bgpd: copyright style

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: rename peer_keepalives* --> bgp_keepalives*
Quentin Young [Fri, 12 May 2017 03:54:18 +0000 (03:54 +0000)]
bgpd: rename peer_keepalives* --> bgp_keepalives*

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: implement buffered reads
Quentin Young [Tue, 2 May 2017 00:37:45 +0000 (00:37 +0000)]
bgpd: implement buffered reads

* Move and modify all network input related code to bgp_io.c
* Add a real input buffer to `struct peer`
* Move connection initialization to its own thread.c task instead of
  piggybacking off of bgp_read()
* Tons of little fixups

Primary changes are in bgp_packet.[ch], bgp_io.[ch], bgp_fsm.[ch].
Changes made elsewhere are almost exclusively refactoring peer->ibuf to
peer->curr since peer->ibuf is now the true FIFO packet input buffer
while peer->curr represents the packet currently being processed by the
main pthread.
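
The split between a FIFO input buffer and a current packet can be sketched as follows. The ibuf and curr field names mirror the ones mentioned above, but the struct layouts, the mutex, and the helper functions are assumptions for the example, not the bgpd data structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical packet and queue types. */
struct pkt {
    struct pkt *next;
    int id;
};

struct pkt_fifo {
    pthread_mutex_t mtx;
    struct pkt *head, *tail;
    size_t count;
};

struct peer_sketch {
    struct pkt_fifo ibuf; /* filled by the I/O pthread */
    struct pkt *curr;     /* packet being processed by the main pthread */
};

static void fifo_init(struct pkt_fifo *f)
{
    pthread_mutex_init(&f->mtx, NULL);
    f->head = f->tail = NULL;
    f->count = 0;
}

/* I/O side: append a fully read packet. */
static void fifo_push(struct pkt_fifo *f, struct pkt *p)
{
    pthread_mutex_lock(&f->mtx);
    p->next = NULL;
    if (f->tail)
        f->tail->next = p;
    else
        f->head = p;
    f->tail = p;
    f->count++;
    pthread_mutex_unlock(&f->mtx);
}

/* Main side: take the oldest packet, or NULL if the buffer is empty. */
static struct pkt *fifo_pop(struct pkt_fifo *f)
{
    struct pkt *p;

    pthread_mutex_lock(&f->mtx);
    p = f->head;
    if (p) {
        f->head = p->next;
        if (!f->head)
            f->tail = NULL;
        f->count--;
        p->next = NULL;
    }
    pthread_mutex_unlock(&f->mtx);
    return p;
}

/* Move the next buffered packet into curr; returns 0 when none is left. */
static int peer_next_packet(struct peer_sketch *peer)
{
    peer->curr = fifo_pop(&peer->ibuf);
    return peer->curr != NULL;
}
```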

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: move bgp i/o to a separate source file
Quentin Young [Tue, 18 Apr 2017 18:11:43 +0000 (18:11 +0000)]
bgpd: move bgp i/o to a separate source file

After implementing threading, bgp_packet.c was serving the double purpose
of consolidating packet parsing functionality and handling actual I/O
operations. This is somewhat messy and difficult to understand. I've
thus moved all code and data structures for handling threaded packet
writes to bgp_io.[ch].

Although bgp_io.[ch] only handles writes at the moment to keep the noise
on this commit series down, for organizational purposes it's probably
best to move bgp_read() and its trappings into here as well and
restructure that code so that read() calls happen in the pthread and
packet processing happens on the main thread.

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
7 years agobgpd: use new threading infra
Quentin Young [Sun, 16 Apr 2017 05:18:07 +0000 (05:18 +0000)]
bgpd: use new threading infra

Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>