During initial session establishment, bgpd performs a "connection
transfer" to a new peer struct if the connection was initiated passively
(i.e. by the remote peer). With the addition of buffered input and a
reorganized packet processor, the following race condition manifests:
1. Remote peer initiates a connection. After exchanging OPEN messages,
we send them a KEEPALIVE. They send us a KEEPALIVE followed by
10,000 UPDATE messages. The I/O thread pushes these onto our local
peer's input buffer and schedules a packet processing job on the
main thread.
2. The packet job runs and processes the KEEPALIVE, which completes the
handshake on our end. As part of transferring to ESTABLISHED we
transfer all peer state to a new struct, as mentioned. Upon returning
from the KEEPALIVE processing routing, the peer context we had has
now been destroyed. We notice this and stop processing. Meanwhile
10k UPDATE messages are sitting on the input buffer.
3. N seconds later, the remote peer sends us a KEEPALIVE. The I/O thread
schedules another process job, which finds 10k UPDATEs waiting for
it. Convergence is achieved, but has been delayed by the value of the
KEEPALIVE timer.
The racey part is that if the remote peer takes a little bit of time to
send UPDATEs after KEEPALIVEs -- somewhere on the order of a few hundred
milliseconds -- we complete the transfer successfully and the packet
processing job is scheduled on the new peer upon arrival of the UPDATE
messages. Yuck.
The solution is to schedule a packet processing job on the new peer
struct after transferring state.
Lengthy commit message in case someone has to debug similar problems in
the future...
Signed-off-by: Quentin Young <qlyoung@cumulusnetworks.com>
BGP_TIMER_OFF(from_peer->t_connect);
BGP_TIMER_OFF(from_peer->t_connect_check_r);
BGP_TIMER_OFF(from_peer->t_connect_check_w);
+ BGP_TIMER_OFF(from_peer->t_process_packet);
/*
* At this point in time, it is possible that there are packets pending
bgp_reads_on(peer);
bgp_writes_on(peer);
+ thread_add_timer_msec(bm->master, bgp_process_packet, peer, 0,
+ &peer->t_process_packet);
if (from_peer)
peer_xfer_stats(peer, from_peer);
thread_add_read(fpt->master, bgp_process_reads, peer, peer->fd,
&peer->t_read);
if (added_pkt)
- thread_add_event(bm->master, bgp_process_packet, peer,
- 0, NULL);
+ thread_add_timer_msec(bm->master, bgp_process_packet,
+ peer, 0, &peer->t_process_packet);
}
return 0;