Donald Sharp [Wed, 20 May 2015 01:04:20 +0000 (18:04 -0700)]
bgpd-nht-import-check-fix.patch
BGP: Fix network import check use with NHT instead of scanner
When next hop tracking was implemented and the bgp scanner was eliminated,
the "network import-check" command got broken. This patch fixes that
issue. NHT is used to not just track nexthops, but also the static routes
that are announced as part of BGP's network command. To optimize
performance, these static routes are registered only when import-check is
enabled.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:04:19 +0000 (18:04 -0700)]
During connection setup, there may be two connections in progress for a BGP
peer - one initiated by the local system and the other initiated by the peer.
Enhance key debug logs to also print the socket file descriptor so that it is
clear which events pertain to which connection.
Donald Sharp [Wed, 20 May 2015 01:04:18 +0000 (18:04 -0700)]
When a peer is unbound from its peer-group, in some situations the peer is
deleted while in other situations, the peer continues to exist but its
global flags have all been reset. This is incorrect, particularly for the
CONFIG_NODE flag as other parts of the code depend on this flag being set
for a configured peer. This patch ensures that the correct flags still
remain set for the peer after unbind from its peer-group.
Donald Sharp [Wed, 20 May 2015 01:04:17 +0000 (18:04 -0700)]
The retry of BGP connection after expiry of connect retry timer was
broken by some earlier patches. Instead of staying in Connect state
after reattempting the connection, the state used to go back to Idle
and then try to connect. This patch fixes this error.
Donald Sharp [Wed, 20 May 2015 01:04:16 +0000 (18:04 -0700)]
Zebra: Don't resolve routes over default for nexthop tracking
Resolving routes over the default route for NHT can lead to all sorts
of problems. So, we explicitly exclude resolving routes for NHT over the
default route. A knob is provided to allow the route to be resolved over
the default in case of special circumstances.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:04:16 +0000 (18:04 -0700)]
Zebra: Ensure we compare prefix and NHs when checking if NH changed
In nexthop tracking, the code currently compares the nexthop state of the
resolved_route for a prefix with the previous nexthop state. However, if
the resolved route itself changes, we can end up comparing the RIBs of
unrelated prefixes and assuming that nothing has changed. To fix this, we
need to store and compare the new resolved route with the previously
resolved route. If this has changed, assume the NH associated with a route
has changed.
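A minimal sketch of the comparison described above, written in Python rather than Zebra's C. The `Resolution` type and `nh_changed` function are hypothetical names used only for illustration; the point is that the resolving prefix is compared alongside the nexthop set.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Resolution:
    """Hypothetical snapshot of how a tracked nexthop was resolved."""
    prefix: str                # prefix of the route that resolved the nexthop
    nexthops: FrozenSet[str]   # nexthop addresses carried by that route


def nh_changed(old: Optional[Resolution], new: Optional[Resolution]) -> bool:
    """Report a change if the resolving route or its nexthops differ."""
    if old is None or new is None:
        # Moving between resolved and unresolved is always a change.
        return old is not new
    if old.prefix != new.prefix:
        # A different resolving route: assume the nexthop changed, even if
        # both routes happen to carry identical nexthop sets.
        return True
    return old.nexthops != new.nexthops
```

Comparing only `nexthops` would miss the case where two unrelated prefixes resolve through the same gateway, which is the situation the patch describes.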
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com> Reviewed-by: Vivek Venkataraman <vivek@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:04:15 +0000 (18:04 -0700)]
Zebra: Static NHT fixes
When NHT calls rib_process() to be invoked for a prefix, the RIB has already
been marked as having NH changes. The first call to nexthop_active_update
clears this flag and attempts to re-determine if there are any NH changes for
a prefix. However, when the NH is recursive, this fails. Furthermore, since
NHT has already determined that this RIB has NH changes, there's no need to
ascertain that again. The original patch used static route as the proxy to
skip this call which was incorrect since rib_process can be invoked for
static routes for reasons other than NHT. So, this patch removes the check
for static route and directly checks if the NH changed flag has been set.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com> Reviewed-by: Vivek Venkataraman <vivek@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:04:15 +0000 (18:04 -0700)]
ospfd: ospf_cli_fixes
ospf: Fix cli issues with timers throttle spf and no ip ospf authentication...
When entering no timers throttle spf there was no way to specify the delay, hold
time and max hold time, so the command was rejected. Accepting these values is
useful for automated processes that take the currently entered cli to remove it.
When entering no ip ospf authentication most forms of the command were being
ignored; this patch fixes that as well.
Signed-off-by: Donald Sharp <sharpd at cumulusnetworks.com>
Reviewed-by:
Donald Sharp [Wed, 20 May 2015 01:04:14 +0000 (18:04 -0700)]
When an incoming connection is received from a neighbor that is configured but
is not activated for any address-family, the connection is accepted without
taking further action. This causes the connection to hang in OpenSent on the
neighbor and can in turn delay the connection setup. Fix to reject incoming
connections when there is no address-family activated for the neighbor.
Donald Sharp [Wed, 20 May 2015 01:04:13 +0000 (18:04 -0700)]
initd-status.patch
Add support for service quagga status.
As per LSB initscript status code definitions, support is added for
querying status of quagga. All daemons that are enabled are checked, and if
any one of them is found to be not running, the appropriate status code is
returned.
Note that if watchquagga is running, a status indicating a problem may be a
transient problem because watchquagga will restart an unresponsive or dead
process.
http://refspecs.linuxbase.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
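The status logic above can be sketched as follows. This is an illustration in Python, not the init script itself: the function names and the input shape are assumptions, and only the exit code values come from the LSB definition of the status action.

```python
from typing import Dict, Tuple

# Exit codes for the "status" action as defined by the LSB init script spec.
LSB_RUNNING = 0        # program is running or service is OK
LSB_DEAD_PIDFILE = 1   # program is dead and a pid file exists
LSB_NOT_RUNNING = 3    # program is not running


def daemon_status(pidfile_exists: bool, process_alive: bool) -> int:
    """Map one daemon's observed state to an LSB status code."""
    if process_alive:
        return LSB_RUNNING
    if pidfile_exists:
        return LSB_DEAD_PIDFILE
    return LSB_NOT_RUNNING


def service_status(daemons: Dict[str, Tuple[bool, bool]]) -> int:
    """Return 0 only if every enabled daemon is running.

    `daemons` maps a daemon name to (pidfile_exists, process_alive).
    The code of the first daemon found not running is returned.
    """
    for name in sorted(daemons):
        code = daemon_status(*daemons[name])
        if code != LSB_RUNNING:
            return code
    return LSB_RUNNING
```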
Donald Sharp [Wed, 20 May 2015 01:04:13 +0000 (18:04 -0700)]
zebra-rtadv-suppress-default-config.patch
Zebra: Suppress displaying default config as part of running config
Quagga doesn't display default config as part of the running config, only
what is different from the default. However, in the case of rtadv, every
link displays the default "ipv6 nd suppress-ra" as part of running config.
This patch fixes that.
Donald Sharp [Wed, 20 May 2015 01:04:12 +0000 (18:04 -0700)]
When a peer that is Established goes down, it is moved into the Clearing
state to facilitate clearing of the routes received from the peer - remove
from the RIB, reselect best path, update/delete from Zebra and to other
peers etc. At the end of this, a Clearing_Completed event is generated to
the FSM which will allow the peer to move out of Clearing to Idle.
The issue in the code is that there is a possibility of multiple Clearing
Completed events being generated for a peer, one per AFI/SAFI. Upon the
first such event, the peer would move to Idle. If other events happened
(e.g., new connection got established) before the last Clearing_Completed
event is received, bad things can happen.
Fix to ensure only one Clearing_Completed event is generated.
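One way to picture the fix, as a hedged Python sketch: track the outstanding AFI/SAFI clearing work per peer and raise the event only when the last one finishes. `ClearingTracker` and its members are hypothetical names, not bgpd code.

```python
class ClearingTracker:
    """Hypothetical per-peer tracker for outstanding route-clearing work."""

    def __init__(self, afi_safis):
        self.pending = set(afi_safis)  # AFI/SAFIs still being cleared
        self.events = []               # FSM events raised, kept for illustration

    def clearing_done(self, afi_safi):
        """Record that one AFI/SAFI finished; raise the event at most once."""
        self.pending.discard(afi_safi)
        if not self.pending and "Clearing_Completed" not in self.events:
            self.events.append("Clearing_Completed")
```

With two address families configured, the first `clearing_done` call raises nothing, so the peer cannot leave Clearing while work for the other family is still in progress.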
Donald Sharp [Wed, 20 May 2015 01:04:12 +0000 (18:04 -0700)]
When unexpected events are received, do not silently transition to Idle
state through bgp_ignore() as that may not do required cleanup. Instead,
define a new event handler to handle such cases, which will go through
bgp_stop(). A similar change is also done to handle the case where an
event handler fails.
Also add a couple of variables to keep track of events for a peer.
Donald Sharp [Wed, 20 May 2015 01:04:11 +0000 (18:04 -0700)]
initd-reload.patch
init.d: Add reload option
Add an option to apply only modifications to running configuration from the
specified configuration file. The default modification file is
/etc/quagga/Quagga.conf. A new script, quagga-reload.py, has been added to
the tools directory.
Donald Sharp [Wed, 20 May 2015 01:04:11 +0000 (18:04 -0700)]
vtysh-add-mark-cmd.patch
VTYSH: Add support for marking a file with appropriate end of context
To support applying only differences to the existing config, this patch
enables supplying the appropriate end markers to a provided file (or
stdin). By end markers, I mean, adding "end" and "exit-address-family"
at the appropriate places in the configuration to ease finding the
differences with the running configuration.
Donald Sharp [Wed, 20 May 2015 01:04:10 +0000 (18:04 -0700)]
Zebra: Fix multiple RNH deletes
The code is structured in a way that ends up invoking zebra_delete_rnh()
multiple times which can lead to crashes and asserts. This patch fixes
the issue by setting a flag when an RNH structure is being deleted and
ignores any further attempts to delete the structure.
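The guard described above follows a common pattern, sketched here in Python. The `Rnh` class, the `deleting` attribute and `delete_rnh` are illustrative names and do not reflect the actual Zebra structures.

```python
class Rnh:
    """Hypothetical stand-in for a tracked-nexthop (RNH) structure."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.deleting = False  # set once deletion has started


def delete_rnh(rnh, table):
    """Delete `rnh` from `table`; repeated calls are ignored.

    Returns True if this call performed the deletion.
    """
    if rnh.deleting:
        return False           # a delete is already in progress or done
    rnh.deleting = True
    table.pop(rnh.prefix, None)
    return True
```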
Donald Sharp [Wed, 20 May 2015 01:04:10 +0000 (18:04 -0700)]
Zebra: Add onlink attribute even for recursive routes
When a route is resolved recursively, and the recursively resolved nexthop
has the onlink attribute, the route is not programmed with the nexthop with
the onlink attribute. This patch addresses that.
Donald Sharp [Wed, 20 May 2015 01:04:09 +0000 (18:04 -0700)]
BGP: Fix update-groups commands to match neighbors
show update-groups summary was mislabeled. What it displays is not a summary
at all, but the detailed info about all update-groups. Furthermore, there
was no way to get detailed info about a specific subgroup.
This patch renames "show * update-groups summary" to "show * update-groups"
and adds an option to see the info specific to a subgroup only. It also
validates the subgroup-id.
show * update-groups summary will be added separately.
Donald Sharp [Wed, 20 May 2015 01:04:08 +0000 (18:04 -0700)]
Cleanup some code related to NHT.
When BGP connection setup was moved to rely on nexthop tracking, a few silly
bugs were introduced.
- bgp_connect_check() was called unnecessarily which resulted in false
positives which resulted in log messages indicating an error and the FSM
was unnecessarily reset.
- When routes to peer disappeared, and the peer was not directly connected,
the session was not immediately torn down, but only on hold timer expiry.
- When NHT indicated that route to session IP addr was available, the previous
state was not reset and as a result, connect retry timer had to expire
before a reconnection was attempted.
- connected check MUST be enabled only for EBGP non-multihop sessions and
only if disable-connected-check option is not enabled.
Donald Sharp [Wed, 20 May 2015 01:04:07 +0000 (18:04 -0700)]
Changing router-id inline isn't handled correctly in the current implementation.
At the minimum, the OSPF_LSA_SELF logic isn't foolproof, and it may hit assert
in ospf_refresh_unregister_lsa on a router-id change.
Once OSPF has created and flooded LSAs, it's not a good idea to change
router-id inline. Tying it to restart has at least two benefits:
- Implementation can remain sane by not having to re-adjust neighbors and LSAs,
based on the new router-id.
- Works as a deterrent for the user to not meddle with the router-id unless
really needed.
Donald Sharp [Wed, 20 May 2015 01:04:04 +0000 (18:04 -0700)]
Remove incorrect call to delete NHT for a route added via "network" command.
When a route is announced in BGP via "network" command, we also register its
next hop with NHT code to allow for updates when the nexthop changes. When this
route is deleted via "no network" command, we incorrectly make a second call to
unregister the NHT tracking associated with this route. This causes a crash.
Fix that.
Donald Sharp [Wed, 20 May 2015 01:04:00 +0000 (18:04 -0700)]
Ensure that during event-driven route-map processing, the peer status is
considered, if required. Attempting to do certain processing while the
peer is not Established can lead to errors.
Donald Sharp [Wed, 20 May 2015 01:03:59 +0000 (18:03 -0700)]
If on-shutdown is configured to a large value and 'service quagga restart'
is executed, then the init.d/quagga script doesn't wait more than 120 seconds
for the daemon to stop. Worse, it goes ahead and starts the new daemon
regardless. This can result in two ospfd processes running on the same config,
which leads to many issues including but not limited to high cpu usage.
That's because the two processes are mixing packets on adjacencies, thus
causing churn on the box and network.
As long as OSPF is able to reliably send the max-metric router-lsa before
exiting, that's mostly good enough for this purpose anyway.
As a solution to this situation, this patch brings the maximum configurable
value of the on-shutdown timer below the maximum retry time to stop a daemon
in init.d/quagga.
Notes: This may not be an upstreamable patch; still, we needed to find
a solution for init.d/quagga and this command to co-exist.
Donald Sharp [Wed, 20 May 2015 01:03:55 +0000 (18:03 -0700)]
bgpd-ensure-fast-eor-send.patch
BGP: Ensure EOR is always sent immediately after all prefixes have been adv.
It's possible that EOR send is delayed until the next KeepAlive timer fires.
This can happen when the send update iteration precisely matches the last
update packet sent. After this since there are no more updates to be sent,
no write thread is setup, but there's still the EOR to be sent. Therefore,
EOR is not sent right away causing some neighbors to not exit RO mode and
delaying convergence overall. This patch ensures that EOR is sent at the end
of all updates on startup.
Signed-off-by: Vivek Venkataraman <vivek@cumulusnetworks.com> Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:54 +0000 (18:03 -0700)]
Ensure that if 'update-source <interface>' is specified, that interface is
chosen as the source for the local nexthops. Otherwise, do a complete
match on the local IP address of the connection to determine the source
interface for the local nexthops; this will handle scenarios where there
is an overlap of subnets between interfaces (e.g., loopback and another
interface).
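The selection rule above can be sketched in Python using the standard `ipaddress` module. The function name and the shape of the `interfaces` argument are assumptions made for the example; bgpd's own lookup is not shown in this commit message.

```python
import ipaddress
from typing import Dict, List, Optional


def nexthop_source_interface(
    local_ip: str,
    interfaces: Dict[str, List[str]],
    update_source_if: Optional[str] = None,
) -> Optional[str]:
    """Pick the interface used as the source for local nexthops.

    `interfaces` maps an interface name to its addresses in CIDR form.
    """
    if update_source_if is not None:
        # An explicitly configured 'update-source <interface>' always wins.
        return update_source_if
    addr = ipaddress.ip_address(local_ip)
    for ifname, cidrs in interfaces.items():
        for cidr in cidrs:
            # Complete match on the address, not a subnet match.
            if ipaddress.ip_interface(cidr).ip == addr:
                return ifname
    return None
```

For example, with `swp1` holding 10.0.0.2/24 and `lo` holding 10.0.0.1/32, a subnet match for 10.0.0.1 could land on `swp1`, while the complete match returns `lo`.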
Donald Sharp [Wed, 20 May 2015 01:03:53 +0000 (18:03 -0700)]
Fixing a couple of issues with ospf6_route_remove () routine.
When a route_node has multiple ospf6_routes under it (common subnet case),
then the current implementation has an issue in adjusting the route_node->info
on a ospf6_route_remove() call.
The main reason is that it ends up using exact match to determine if the next
ospf6_route belongs to the same route_node or not. Fixing that part to use
rnode (the existing back-pointer to the route_node) from the ospf6_route to
determine that.
Also fixing some of the walks to turn them safe so that the route deletion is
fine.
Donald Sharp [Wed, 20 May 2015 01:03:52 +0000 (18:03 -0700)]
Process and/or announce existing routes when a prefix-list, distribute-
list or filter-list is applied (added or removed) against a neighbor or
peer group. This makes the behavior in line with other configuration changes
such as add or remove of route-map against a neighbor or change of other
settings such as next-hop-self or as-override.
Donald Sharp [Wed, 20 May 2015 01:03:51 +0000 (18:03 -0700)]
LA (local-address) bit related inter-op fix.
As per the RFC, when the NU bit is set, prefix should be ignored.
However, the code is currently ignoring prefix with LA bit too.
Fixing that part.
In future, we should also set LA bit for the loopback addresses. Not doing this
part right away, as quagga won't be backward compatible with its own previous
releases. Maybe after a release or so, we should start setting LA bit too.
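A short sketch of the corrected check, in Python. The bit values follow the OSPFv3 PrefixOptions layout (NU is 0x01, LA is 0x02); the function name is hypothetical.

```python
PREFIX_OPTION_NU = 0x01  # NU bit: prefix excluded from unicast calculations
PREFIX_OPTION_LA = 0x02  # LA bit: prefix is a local address of the advertiser


def should_ignore_prefix(options: int) -> bool:
    """Ignore a prefix only when its NU bit is set.

    The LA bit alone must not cause the prefix to be ignored.
    """
    return bool(options & PREFIX_OPTION_NU)
```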
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:51 +0000 (18:03 -0700)]
Ensure that routes from a peer are not considered for best path
comparison if the peer is not in an Established state. There can
be a window between a peer being deleted and the run of the background
thread that actually clears the routes (marks them as "removed"),
during which best path may run. If this path selection
compared two prefixes all the way down to peer IP addresses and
one of these two peers had just been deleted, that peer would
not have its sockunion structures, especially su_remote, resulting
in a BGPD exception.
Donald Sharp [Wed, 20 May 2015 01:03:50 +0000 (18:03 -0700)]
Make OSPF compliant to the last sentence of this section in RFC 2328
9.5 Sending Hello packets
Hello packets are sent out each functioning router interface.
They are used to discover and maintain neighbor
relationships.[6] On broadcast and NBMA networks, Hello Packets
are also used to elect the Designated Router and Backup
Designated Router.
The format of an Hello packet is detailed in Section A.3.2. The
Hello Packet contains the router's Router Priority (used in
choosing the Designated Router), and the interval between Hello
Packets sent out the interface (HelloInterval). The Hello
Packet also indicates how often a neighbor must be heard from to
remain active (RouterDeadInterval). Both HelloInterval and
RouterDeadInterval must be the same for all routers attached to
a common network. The Hello packet also contains the IP address
mask of the attached network (Network Mask). On unnumbered
point-to-point networks and on virtual links this field should
be set to 0.0.0.0.
Donald Sharp [Wed, 20 May 2015 01:03:50 +0000 (18:03 -0700)]
When internal operations are performed (e.g., best-path selection, next-hop
change processing etc.) that refer to the BGP instance, the correct BGP
instance must be referenced and not the default BGP instance. The default
BGP instance is the first instance on the instance list. In a scenario
where one BGP instance is deleted (through operator action such as a
"no router bgp" command) and another instance exists or is created, there
may still be events in-flight that need to be processed against the
deleted instance. Trying to process these against the default instance
is erroneous. The calls to bgp_get_default() must be limited to the user
interface (vtysh) context.
Donald Sharp [Wed, 20 May 2015 01:03:49 +0000 (18:03 -0700)]
bgpd: Disable connected check for next hop on eBGP peers
In the data center, in conjunction with next hop propagation for features
such as announcing VIP routes to load balancers and such, it is desired to
disable the connected route check even on ebgp peers with TTL of 1. This
patch is used to disable the check for all peers instead of the peer by
peer check that is currently supported. Furthermore, the existing
disable-connected-check is different from how Cisco implements this feature.
So, we add this new flag to avoid reliance on the existing flag.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com> Reviewed-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:49 +0000 (18:03 -0700)]
BGP: Use the new value of dynamic capability in Open
The value for dynamic capability used in BGP open during capability
negotiation is a deprecated value. Thus, interop with other systems
is broken. This patch fixes that by advertising both the old and new
values. This ensures interop with older versions of quagga and other
non-quagga systems.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:49 +0000 (18:03 -0700)]
bgpd: Add route-map support for set ip next-hop unchanged
In the data center, where load balancers are announced as VIPs, and eBGP
is used as the routing protocol, this feature is required to ensure that
VIP announcements can be made from anywhere the operator sees fit.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com> Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:47 +0000 (18:03 -0700)]
This patch adds support for allowing BGP to create and bring up neighbor
sessions dynamically. The operator configures a range of neighbor addresses
to which peering is allowed. The ranges are configured as subnets and
multiple ranges are allowed. Each range is associated with a peer-group
so that additional parameters can be configured.
BGP neighbor sessions are dynamically created when connections are initiated
by remote neighbors whose addresses fall within a configured range. The
sessions are deleted when the BGP connection terminates.
A limit on the number of neighbors allowed from each range of addresses
can be specified.
IPv4 and IPv6 peering is supported. Over the peering, any of the address
families configured for the peer-group can be negotiated.
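The behavior described above can be sketched as follows. This is an illustrative Python model, not the bgpd implementation; `ListenRange`, `accept_connection` and `connection_closed` are hypothetical names.

```python
import ipaddress


class ListenRange:
    """Hypothetical configured range of allowed neighbor addresses."""

    def __init__(self, cidr, peer_group, limit):
        self.network = ipaddress.ip_network(cidr)
        self.peer_group = peer_group   # supplies the remaining parameters
        self.limit = limit             # max dynamic neighbors from this range
        self.active = set()            # addresses with a live session


def accept_connection(ranges, remote_ip):
    """Return the peer-group for a new dynamic neighbor, or None to reject."""
    addr = ipaddress.ip_address(remote_ip)
    for r in ranges:
        if addr.version == r.network.version and addr in r.network:
            if len(r.active) >= r.limit:
                return None            # range is full
            r.active.add(addr)
            return r.peer_group
    return None                        # address is outside every range


def connection_closed(ranges, remote_ip):
    """Delete the dynamic neighbor when its BGP connection terminates."""
    addr = ipaddress.ip_address(remote_ip)
    for r in ranges:
        r.active.discard(addr)
```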
Donald Sharp [Wed, 20 May 2015 01:03:47 +0000 (18:03 -0700)]
BGP: Add dynamic update group support
This patch implements the 'update-groups' functionality in BGP. This is a
function that can significantly improve BGP performance for Update generation
and resultant network convergence. BGP Updates are formed for "groups" of
peers and then replicated and sent out to each peer rather than being formed
for each peer. Thus major BGP operations related to outbound policy
application, adj-out maintenance and actual Update packet formation
are optimized.
BGP update-groups dynamically groups peers together based on configuration
as well as run-time criteria. Thus, it is more flexible than update-formation
based on peer-groups, which relies on operator configuration.
[Note that peer-group based update formation has been introduced into BGP by
Cumulus but is currently intended only for specific releases.]
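The idea of forming an Update once per group can be sketched in Python as below. The grouping key used here (address family, peer type, outbound route-map) is an assumption chosen for the example; the actual criteria used by the patch are not listed in this message.

```python
from collections import defaultdict


def build_update_groups(peers):
    """Group peers whose outbound handling would be identical.

    `peers` maps a peer name to a dict of outbound attributes.
    """
    groups = defaultdict(list)
    for name, attrs in peers.items():
        key = (attrs["afi_safi"], attrs["peer_type"], attrs.get("route_map_out"))
        groups[key].append(name)
    return dict(groups)


def send_updates(groups, format_update):
    """Form one Update per group, then replicate it to each member."""
    sent = {}
    for key, members in groups.items():
        packet = format_update(key)    # policy and packet formation run once
        for member in members:
            sent[member] = packet      # replication is a cheap copy
    return sent
```

With N peers sharing the same outbound policy, `format_update` runs once instead of N times, which is where the performance gain described above comes from.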
Donald Sharp [Wed, 20 May 2015 01:03:45 +0000 (18:03 -0700)]
Per AFI redist registrations
The problem is that zclient->redist[ZEBRA_ROUTE_MAX] used for storing a
client’s redist state, has no address-family qualification. This means
a client can only store its interest in a protocol (connected, static etc.),
but can't choose IPv4 or IPv6 with that. This hindered clients from
managing redistribution of both IPv4 and IPv6.
BGP's redistribution of protocols like connected/static is one such place.
One fix could be to overload this and flap the redist connection each time
any new afi is added for redist, but that may have side-effects on the
existing afi redist.
The cleaner way is to modify redist data-structure to also take AFI, and adjust
routines that deal with it, so that a client can register for a protocol
redistribution based on the AFI. BGP already maintains redistribution state
based on afi and protocol (bgp->redist[AFI_MAX][ZEBRA_ROUTE_MAX]). This patch
takes care of filling up the gap in zclient/zserv redistribution state to
also use AFI qualification.
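The change in data structure can be illustrated with a small Python model. `RedistState` is a hypothetical name; the real state lives in the zclient/zserv C structures, and the string keys stand in for the AFI and ZEBRA_ROUTE_* constants.

```python
class RedistState:
    """Hypothetical client redistribution state qualified by AFI."""

    def __init__(self):
        self._registered = set()       # (afi, route_type) pairs

    def register(self, afi, route_type):
        self._registered.add((afi, route_type))

    def unregister(self, afi, route_type):
        self._registered.discard((afi, route_type))

    def is_registered(self, afi, route_type):
        return (afi, route_type) in self._registered
```

Because the key includes the AFI, adding IPv6 redistribution of a protocol does not disturb an existing IPv4 registration for the same protocol.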
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:44 +0000 (18:03 -0700)]
During best path selection, if one of the candidates is a stale entry, do not
perform the neighbor address comparison as that information is invalid for
the stale entry. Attempting to perform the comparison results in a bgpd
exception.
Donald Sharp [Wed, 20 May 2015 01:03:43 +0000 (18:03 -0700)]
ISSUE:
LSAcks (for directed acks) are being sent to neighbor's unicast address.
RFC 2328 says:
"The IP destination address for the packet is selected as
follows. On physical point-to-point networks, the IP
destination is always set to the address AllSPFRouters"
Fix is to unconditionally set the destination address for LSAcks over
point-to-point links as AllSPFRouters. Quagga OSPF already has similar
change for OSPF DBD, LSUpdate and LSrequest packets.
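A minimal sketch of the destination choice, in Python. The function name and the string used for the network type are illustrative; 224.0.0.5 is the AllSPFRouters multicast address.

```python
ALL_SPF_ROUTERS = "224.0.0.5"  # AllSPFRouters multicast group


def lsack_destination(network_type: str, neighbor_addr: str) -> str:
    """Choose the IP destination for a directed LSAck."""
    if network_type == "point-to-point":
        # Unconditionally multicast on point-to-point links.
        return ALL_SPF_ROUTERS
    # Other network types keep the neighbor's unicast address here.
    return neighbor_addr
```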
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:42 +0000 (18:03 -0700)]
zebra-redistribute-table.patch
Zebra: Redistribute routes from non-main kernel table to main.
This can be the basis for many interesting features such as variations
of redistribute ARP, using zebra as the RIB in the presence of multiple
routing protocol stacks etc. The code only supports IPv4 for now, but
the infrastructure is in place for IPv6.
Usage:
There is a new route type introduced by this model: TABLE. Routes
imported from alternate kernel tables will have their protocol type set to
TABLE.
Routes from alternate kernel tables MUST be first imported into the main
table via "ip import-table <table id>". They can then be redistributed via
a routing protocol via the "redistribute table" command. Each imported table
can have an optional administrative distance specified. In Zebra, a route with a
lower distance is chosen over routes with a higher distance. So, distance
is how the user can choose to prioritize routes from a particular table over
routes from other tables or routes learnt another way in zebra.
Route maps for imported tables are specified via "ip protocol" command in
zebra. Route maps for redistributed routes within a routing protocol are
subject to the route map options supported by the protocol. The
"match source-protocol" option in route maps can match against "table"
to filter routes learnt from alternate kernel routing tables.
Signed-off-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
- etc/init.d/quagga is modified to support creating separate ospf daemon
process for each instance. Each individual instance is monitored by
watchquagga just like any protocol daemon (requires initd-mi.patch).
- Vtysh is modified to be able to connect to multiple daemons of the same
protocol (supported for OSPF only for now).
- ospfd is modified to remember the Instance-ID that it's invoked with. For
the entire life of the process it caters to any command request that
matches that instance-ID (unless it's a non instance specific command).
Routes/messages to zebra are tagged with instance-ID.
- zebra route/redistribute mechanisms are modified to work with
[protocol type + instance-id]
- bgpd now has ability to have multiple instance specific redistribution
for a protocol (OSPF only supported/tested for now).
- zlog ability to display instance-id besides the protocol/daemon name.
- Changes in other daemons are because of the needed integration with
some of the modified APIs/routines. (Didn't prefer replicating too many
separate instance specific APIs.)
- config/show/debug commands are modified to take instance-id argument
as appropriate.
Guidelines to start using multi-instance ospf
---------------------------------------------
The patch is backward compatible, i.e., the previous single ospf daemon
configuration (router ospf <cr>) will continue to work as is, including all
the show commands etc.
To enable multiple instances, do the following:
1. service quagga stop
2. Modify /etc/quagga/daemons to add instance-ids of each desired
instance in the following format:
ospfd="yes"
ospfd_instances="1,2,3"
assuming you want to enable 3 instances with those instance ids.
3. Create corresponding ospfd config files as ospfd-1.conf, ospfd-2.conf
and ospfd-3.conf.
4. service quagga start/restart
5. Verify that the daemons are started as expected. You should see
ospfd started with -n <instance-id> option.
ps -ef | grep quagga
With that /var/run/quagga/ should have ospfd-<instance-id>.pid and
ospfd-<instance-id>/vty for each instance.
6. vtysh to work with instances as you would with any other daemons.
7. Overall most quagga semantics are the same working with the instance
daemon, like it is for any other daemon.
NOTE:
To safeguard against errors leading to too many processes getting invoked,
a hard limit on number of instance-ids is in place; currently it's 5.
Allowed instance-id range is <1-65535>
Once daemons are up, show running from vtysh should show the instance-id
of each daemon as 'router ospf <instance-id>' (without needing explicit
configuration)
Instance-id cannot be changed via vtysh, other router ospf configuration
is allowed as before.
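The limits in the note above can be sketched as a small validation routine. This is a Python illustration, not the init script; `parse_instances` and `config_files` are hypothetical helpers, and rejecting duplicate ids is an assumption of the sketch.

```python
MAX_INSTANCES = 5  # hard limit on the number of instance-ids (from the note)


def parse_instances(value):
    """Parse an ospfd_instances value such as "1,2,3" into a list of ids."""
    ids = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        inst = int(token)
        if not 1 <= inst <= 65535:
            raise ValueError(f"instance-id {inst} outside <1-65535>")
        if inst in ids:
            # Assumption of this sketch: duplicates are treated as an error.
            raise ValueError(f"duplicate instance-id {inst}")
        ids.append(inst)
    if len(ids) > MAX_INSTANCES:
        raise ValueError(f"more than {MAX_INSTANCES} instances configured")
    return ids


def config_files(ids):
    """Name the per-instance config files, e.g. ospfd-1.conf."""
    return [f"ospfd-{inst}.conf" for inst in ids]
```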
Signed-off-by: Vipin Kumar <vipin@cumulusnetworks.com> Reviewed-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Dinesh G Dutt <ddutt@cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:41 +0000 (18:03 -0700)]
initd: initd-mi.patch
Support Multi-Instance protocol daemons initd
OSPFd is the first of the multi-instance daemons. This patch allows the
starting, stopping, restarting and monitoring of multiple instances of
the same protocol daemon.
Multiple instances are specified in the daemons file using a new variable:
ospfd_instances="1,2"
Absence of this variable means ospfd will start in legacy, single instance
mode. The original "ospfd=yes" line is still required.
Daemons are started with the "-n <instance>" option. Each daemon is named
"<daemon>-<instance>", for example "ospfd-1", "ospfd-2" etc. Similarly,
pid files are ospfd-1.pid and vty files are named ospfd-1.vty.
We're also introducing a new file, /etc/default/quagga to store the
default value for the maximum instances associated with a daemon.
watchquagga and others are unmodified and everything else just works once
this code is in place.
The code has been enhanced to support restarting watchquagga with only the
updated daemons when an individual daemon is stopped or started. For example,
without this patch, stopping just bgpd would terminate watchquagga even if
ospfd and zebra are still running. Similarly, starting just bgpd when ospfd
and zebra are running wouldn't update watchquagga to include bgpd. Furthermore,
when the daemons file is modified and a daemon is no longer deemed necessary
and quagga restarted, the daemon is not killed. For example, switching
ospfd=yes to ospfd=no and restarting quagga will leave the ospfd daemon
running. This case is also fixed with this patch.
However, adding a new instance to the ospfd_instances variable and starting
just that instance will start just that instance and add it to watchquagga.
Similarly, a single instance may be stopped or restarted.
Caveat emptor: With multi-instance daemons, stopping a single instance and then
starting a different instance will cause all instances to be monitored by
watchquagga i.e. all instances will be restarted, if necessary.
Signed-off-by: Dinesh G Dutt <ddutt at cumulusnetworks.com>
Donald Sharp [Wed, 20 May 2015 01:03:41 +0000 (18:03 -0700)]
ospf6d: ospfv3-show-cmds-instance-check-fix.patch
SYMPTOM:
If some of the ospfv3 commands like 'show ipv6 ospf6 route' are executed
with ospf6d daemon running but before having any ospfv3 configuration, then
ospf6d crash is seen.
ISSUE:
There are a few show commands, which are (unlike others) not checking if
ospf6 instance is initialized already.
FIX:
Add the missing checks, by using OSPF6_CMD_CHECK_RUNNING() in the commands
where it's needed and not yet used.
Donald Sharp [Wed, 20 May 2015 01:03:40 +0000 (18:03 -0700)]
ospf6d: ospfv3-setsocket-retry.patch
SYMPTOM:
With quagga running on Linux, 'ifdown <if-name>' followed by 'ifup <ifname>'
can cause OSPFv3 to not receive Hello packets on the interface.
ISSUE:
Operating System's interface IPv6 readiness may not be guaranteed at the
time of interface-up event. That's because the ipv6 components in an OS may
also be listening to the same interface-up event that (in this case) is
relayed to OSPFv3.
In this failure case, setsockopt with option IPV6_JOIN_GROUP on the interface
returned EINVAL.
Error logs -
OSPF6: Zebra Interface state change: swp1 index 3 flags 11043 metric 1 mtu 1500
OSPF6: Interface Event swp1: [InterfaceUp]
OSPF6: Network: setsockopt (20) on ifindex 3 failed: Invalid argument
FIX:
To take care of this possible race condition, any address-family related
setting should be retried. Given it's a rare condition and window of this
race should be short, the patch adds a limited retry mechanism for the
IPV6 membership setting on the socket.
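A limited retry of this kind can be sketched in Python as below. The retry count and delay are assumptions chosen for the example; the values used by the patch are not stated here. `join` stands for whatever call performs the IPV6_JOIN_GROUP setsockopt.

```python
import errno
import time


def join_group_with_retry(join, retries=3, delay=0.1, sleep=time.sleep):
    """Call `join()` and retry a bounded number of times on EINVAL.

    Returns True once the join succeeds, False if every attempt failed
    with EINVAL. Any other error is raised to the caller unchanged.
    """
    for attempt in range(retries + 1):
        try:
            join()
            return True
        except OSError as exc:
            if exc.errno != errno.EINVAL:
                raise                  # not the race described above
            if attempt == retries:
                return False           # give up after the last retry
            sleep(delay)               # give the OS time to ready IPv6
    return False
```

Bounding the retries keeps a genuinely misconfigured interface from blocking the daemon while still covering the short window of the race.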
Donald Sharp [Wed, 20 May 2015 00:58:12 +0000 (17:58 -0700)]
Overhaul BGP debugs
Summary of changes
- added an option to enable keepalive debugs for a specific peer
- added an option to enable inbound and/or outbound updates debugs for a specific peer
- added an option to enable update debugs for a specific prefix
- added an option to enable zebra debugs for a specific prefix
- combined "deb bgp", "deb bgp events" and "deb bgp fsm" into "deb bgp neighbor-events". "deb bgp neighbor-events" can be enabled for a specific peer.
- merged "deb bgp filters" into "deb bgp update"
- moved the per-peer logging to one central log file. We now have the ability to filter all verbose debugs on a per-peer and per-prefix basis so we no longer need to keep log files per-peer. This simplifies troubleshooting by keeping all BGP logs in one location. The user can then grep for the peer IP they are interested in if they wish to see the logs for a specific peer.
- Changed "show debugging" in isis to "show debugging isis" to be consistent with all other protocols. This was very confusing for the user because they would type "show debug" and expect to see a list of debugs enabled across all protocols.
- Removed "undebug" from the parser for BGP. Again this was to be consistent with all other protocols.
- Removed the "all" keyword from the BGP debug parser. The user can now do "no debug bgp" to disable all BGP debugs; previously you had to type "no deb all bgp", which was confusing.
The new parse tree for BGP debugging is:
deb bgp as4
deb bgp as4 segment
deb bgp keepalives [A.B.C.D|WORD|X:X::X:X]
deb bgp neighbor-events [A.B.C.D|WORD|X:X::X:X]
deb bgp nht
deb bgp updates [in|out] [A.B.C.D|WORD|X:X::X:X]
deb bgp updates prefix [A.B.C.D/M|X:X::X:X/M]
deb bgp zebra
deb bgp zebra prefix [A.B.C.D/M|X:X::X:X/M]