Donald Sharp [Wed, 16 Nov 2016 17:00:40 +0000 (12:00 -0500)]
bgpd: More Extended nexthop fixing
Basically if we are reading in a cli with a extended-nexthop
and we have not received from zebra the interface we are working
on I believe we have a race condition where we are not
propagating the PEER_FLAG_CAPABILITY_ENHE in this case.
Modify the code to propagate even if we haven't found the
interface yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 15 Nov 2016 14:39:35 +0000 (09:39 -0500)]
bgpd: Fix occassional turn off of extended-nexthop for an if
Sometimes, like once every 400 iterations, when you restart
Quagga, extended-nexthop has been turned off for interface
based config( for 5549 ).
Examining the code, there is only 1 real path to setting
the PEER_FLAG_CAPABILITITY_ENHE and that is through
peer_conf_interface_get. Modify this code path
to always set the PEER_FLAG_CAPABILITY_ENHE if it is
not already set.
In addition, fix a possible pointer dereference.
Ticket: CM-12929 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Don Slice [Tue, 14 Feb 2017 17:15:40 +0000 (09:15 -0800)]
zebra: stop deregistering static nexthops unless removing the static
Problem reported was that with some overlapping static route configurations,
when the link went down the less specific static was not re-installed after
the link came back up. Determined that with the overlapping statics, we
would recursively resolve the next-hop temporarily thru the more specific
static route, but since the next-hop wasn't actually reachable, we would go
through the code that clears the nht information for the static completely.
This caused the nht code to no longer process the static route.
After reviewing the process, there doesn't seem to be any reason that the
static should be deregistered in that section of code. Removed the
deregister and the problem is resolved and not addional failures seen in
manual testing. zebra_test.py completed successfully and ospf and bgp smokes
completed with no new failures.
Ticket: CM-14873 Signed-off-by: Don Slice <dslice@cumulusnetworks.com> Reviewed-by: CCR-5696
Don Slice [Wed, 23 Nov 2016 19:58:27 +0000 (11:58 -0800)]
zebra: Move interfaces to default before deleting
Encountered a crash in zebra due to getting a delete on an SVI with
VRR configured. Since we don't actually do a delete but flag the interface
as inactive, slag VRR interfaces would remain on the vrf_iflist with a lock
count of zero, causing the crash. Since all other interface types are moved
to the default table before deleting, doing the same thing for any interfaces
that were left in the vrf.
Testing includes manual testing, bgp-min, ospf-min, vrf-min, bgp-smoke, and ospf-smoke.
All passed (first time or on rerun) or match known failures.
Ticket: CM-13288 Signed-off-by: Don Slice Reviewed-by: Donald Sharp
Chirag Shah [Tue, 24 Jan 2017 01:39:54 +0000 (17:39 -0800)]
pimd: fix ip pim hello option does not take in effect
If frr.conf file has pim hello option setup prior to pim sm under an interface, upon frr start/restart hello option cli replayed prior to sm command
from vtysh. Added a code in pim hello option cli handler to create pim vif if it does not exist.
Testing Done:
configure pim hello options before pim sm in frr.conf file and restart frr
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Daniel Walton [Wed, 17 May 2017 00:16:09 +0000 (00:16 +0000)]
tools: reload handle removal of entire address-family section under BGP
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
When an entire address-family section is removed from under BGP, we
cannot just issue 'no address-family foo bar' as address-family line
doesn't support 'no'. We have to delete the individual lines under the
address-family.
Daniel Walton [Tue, 16 May 2017 23:58:34 +0000 (23:58 +0000)]
bgpd: 'redistribute' triggers both IPv4 and IPv6 code paths
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com>
Whenever you did "redistribute" zebra would kick this off for ipv4 and
ipv6. No real issue other than this is sub-optimal
Daniel Walton [Tue, 16 May 2017 23:54:46 +0000 (23:54 +0000)]
bgpd: "neighbor swpX interface remote-as XYZ" is ignored
Signed-off-by: Daniel Walton <dwalton@cumulusnetworks.com> Reviewed-by: Don Slice <dslice@cumulusnetworks.com>
If you did:
neighbor swp1 interface
neighbor swp1 interface remote-as external
we would not set the remote-as. You could however still do
neighbor swp1 remote-as external
Don Slice [Wed, 5 Apr 2017 14:13:45 +0000 (07:13 -0700)]
bgpd: resolve ipv6 ecmp issue with vrfs and ll nexthop
Problem reported that ecmp wasn't working correctly in a vrf with
ipv6. Issue was that originator of the routes were sending the updates
with a link-local nexthop and nhlen of 16. In this particular case,
bgp_zebra_announce was using the wrong call to get the ifindex and
was not supplying the vrf. This caused ecmp to work only in the case
of the default vrf.
Ticket: CM-15545 Signed-off-by: Don Slice <dslice@cumulusnetworks.com> Reviewed-by: CCR-6017
Donald Sharp [Wed, 16 Nov 2016 17:00:40 +0000 (12:00 -0500)]
bgpd: More Extended nexthop fixing
Basically if we are reading in a cli with a extended-nexthop
and we have not received from zebra the interface we are working
on I believe we have a race condition where we are not
propagating the PEER_FLAG_CAPABILITY_ENHE in this case.
Modify the code to propagate even if we haven't found the
interface yet.
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Wed, 16 Nov 2016 16:28:11 +0000 (11:28 -0500)]
ospfd: Fix possible crash and wrong data being shown
When you have more than one ospf interface configured
to be used, we were attempting to reuse the
json_interface_sub pointer after we added it
to the json data structure.
Ticket: CM-13597 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 15 Nov 2016 14:39:35 +0000 (09:39 -0500)]
bgpd: Fix occassional turn off of extended-nexthop for an if
Sometimes, like once every 400 iterations, when you restart
Quagga, extended-nexthop has been turned off for interface
based config( for 5549 ).
Examining the code, there is only 1 real path to setting
the PEER_FLAG_CAPABILITITY_ENHE and that is through
peer_conf_interface_get. Modify this code path
to always set the PEER_FLAG_CAPABILITY_ENHE if it is
not already set.
In addition, fix a possible pointer dereference.
Ticket: CM-12929 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Chirag Shah [Sat, 6 May 2017 18:18:24 +0000 (11:18 -0700)]
pimd: IGMP Query Tx when Oth. Querier exp
Send IGMP General Query and Get Gen. Query timer,
once Other Querier timer expired.
Ticket:CM-13152
Reviewed By:CCR-6189
Testing Done:
tor-4# show ip igmp interface
Interface State Address V Querier Query Timer Uptime
swp2 up 70.1.1.2 3 other --:--:-- 00:06:45
swp3 up 80.1.1.2 3 local 00:00:35 00:22:52
Upon disabling IGMP on remote igmp interface, after other querier expiry
sends a query and elects as Querier
tor-4# show ip igmp interface
Interface State Address V Querier Query Timer Uptime
swp2 up 70.1.1.2 3 local 00:01:29 00:10:16
swp3 up 80.1.1.2 3 local 00:01:14 00:26:23
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Chirag Shah [Tue, 9 May 2017 18:30:43 +0000 (11:30 -0700)]
pimd: Avoid deleting SGRpt entry from PP->P state
-Upon Receving SGRpt Prune message, transitioning from Prune Pending state
to NOINFO state, ifchannel entry was getting deleted in prune pending timer
expiry. This can result in SGRpt ifhchannel deleted and recreated upon receving
triggered or periodic SGRpt received from downstream.
The automation test failed as it expected (check) SGRpt entry at RP after it triggers
SPT switchover.
- While transitioning from Prune-Pending state to NOINFO(Pruned) state, Trigger
SGRpt message towards RP.
- Add/del some of the debug traces
Ticket:CM-16057
Reviewed By:CCR-6198
Testing Done:
Rerun test08 multiple times and observed passing it.
Pim-smoke with hardnode
Ran 95 tests in 11219.420s
FAILED (SKIP=10, failures=4)
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Donald Sharp [Wed, 8 Mar 2017 15:27:21 +0000 (10:27 -0500)]
pimd: Fix core from json static mroute show
When we have no normal mroutes and only static mroutes
there exists a code path where we attempted to dereference
c_oil when we were showing static mroutes. Looks like
a cut-n-paste error, we should be using s_route->c_oil there
Ticket: CM-15013 Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 9 May 2017 20:18:04 +0000 (16:18 -0400)]
*: Remove ability to install frr_sudoers
If the user were to uncomment last line
and allow VTYSH_SHOW to be used as a non-root
account, this would allow arbitrary command completion
inside of vtysh via multiple -c ... -c .... lines
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Donald Sharp [Tue, 9 May 2017 20:18:04 +0000 (16:18 -0400)]
*: Remove ability to install frr_sudoers
If the user were to uncomment last line
and allow VTYSH_SHOW to be used as a non-root
account, this would allow arbitrary command completion
inside of vtysh via multiple -c ... -c .... lines
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Chirag Shah [Fri, 21 Apr 2017 22:08:03 +0000 (15:08 -0700)]
pimd: Fix WG/SGRpt & WG J/P processing
During processing of Join/Prune,
for a S,G entry, current state is SGRpt, when only *,G is
received, need to clear SGRpt and add/inherit the *,G OIF to S,G so
it can forward traffic to downstream where *,G is received.
Upon receiving SGRpt prune remove the inherited *,G OIF.
From, downstream router received *,G Prune along with SGRpt
prune. Avoid sending *,G and SGRpt Prune together.
Reset upstream_del reset ifchannel to NULL.
Testing Done:
Run failed smoke test of sending data packets, trigger SPT switchover,
*,G path received SGRpt later data traffic stopped S,G ages out from LHR, sends only
*,G join to upstream, verified S,G entry inherit the OIF.
Upon receiving SGRpt deletes inherited oif and retains in SGRpt state.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Chirag Shah [Wed, 26 Apr 2017 01:56:00 +0000 (18:56 -0700)]
pimd: replay IGMP static group upon if creation
Upon static IGMP configured port down igmp source info is destroyed.
Upon port up or quagga restart (if addr add) register IGMP sock with
port, source, group.
Testing Done:
Configure IGMPv3 Report on Host swpx, ifdown/ifup swpx,
verify ip mroute containng IGMP mroute
tor-1# show ip mroute
Source Group Proto Input Output TTL Uptime
30.1.1.1 225.1.1.21 IGMP swp2 swp3 1 00:00:05
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Chirag Shah [Thu, 4 May 2017 01:56:44 +0000 (18:56 -0700)]
pimd: fix pimd crash when pim interface disabled
Upon pim enabled disabled, current upstreams entries rpf update callback,
needs to check if old rpf is still pim enabled otherwise do not trigger
prune towards old rpf.
Testing Done:
Verified by disabling pim enabled rpf interface and
crash is not observed upon disabling pim on rpf interface.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Chirag Shah [Mon, 10 Apr 2017 20:27:56 +0000 (13:27 -0700)]
pimd: fix pim ecmp rebalance config write
ip pim ecmp and ip pim ecmp rebalance configuration CLIs were
not adding to Quagga.confg or running configuration.
Added both the configuration write in Config write handler.
Testing Done: Execute configuration cli and verified running config
and Quagga.conf file containing both configuration.
03# show running-config
Building configuration...
Current configuration:
!
ip multicast-routing
ip pim rp 6.0.0.9 230.0.0.0/16
ip pim join-prune-interval 61
ip pim ecmp
ip pim ecmp rebalance
!
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Chirag Shah [Thu, 27 Apr 2017 18:01:32 +0000 (11:01 -0700)]
pimd: fix channel_oil and upstream RPF in sync
During PIM Neighbor change/UP event, pim_scan_oil api
scans all channel oil to see any rpf impacted. Instead of
passing current upstream's RPF it passes current RPF as 0 and
does query to rib for nexhtop (without ECMP/Rebalance). This creates
inconsist RPF between Upstream and Channel oil.
In Channel Oil keep backward pointer to upstream DB and fetch up's
RPF and passed to channel_oil scan.
Decrement channel_oil ref_count in upstream_del when decrementing
up ref_count and it is not the last.
Created ECMP based FIB lookup API.
Testing Done:
Performed following testing on tester setup:
5 x LHR, 4 x MSDP Spines, 6 Sources each sending to 1023 groups from one of the spines.
Total send rate 8Mpps.
Test that caused problems was to reboot every device at the same time.
After fix performed 5 iterations of reboot devices and show no sign of the problem.
Signed-off-by: Chirag Shah <chirag@cumulusnetworks.com>
Don Slice [Fri, 5 May 2017 13:53:49 +0000 (06:53 -0700)]
bgpd: resolve crash displaying bgp vrf routing info
Problem uncovered with crash when entering the command "show ip bgp
vrf vrf1001 0.0.0.0". The crash was caused by a mistake incrementing
the index value in the vrf/view case. Manual testing completed and
failing test case now passes successfully.
Ticket: CM-16223 Signed-off-by: Don Slice <dslice@cumulusnetworks.com>
Reviewed By: Donald Sharp <sharpd@cumulusnetworks.com>